Fetch album covers from Amazon

Edit: This does not seem to work anymore.

This is a stripped-down snippet from Sonata that fetches music album covers via Amazon's web services API.

[download amazon.py]

#!/usr/bin/env python
# Copyright 2006-2007 Scott Horowitz
# Licensed under the GPL
import urllib2, urllib
def download_image_to_filename(artist, album, dest_filename):
    # Returns False if no image was found
    imgfound = False
    img_url = ""
    # Amazon currently doesn't support utf8 and suggests latin1 encoding instead:
    try:
        artist = urllib.quote(artist.encode('latin1'))
        album = urllib.quote(album.encode('latin1'))
    except UnicodeError:
        artist = urllib.quote(artist)
        album = urllib.quote(album)
    amazon_key = "12DR2PGAQT303YTEWP02"
    search_url = "http://webservices.amazon.com/onca/xml?" \
                 + "Service=AWSECommerceService&AWSAccessKeyId=" \
                 + amazon_key \
                 + "&Operation=ItemSearch&SearchIndex=Music&Artist="\
                 + artist \
                 + "&ResponseGroup=Images&Keywords=" \
                 + album
    request = urllib2.Request(search_url)
    opener = urllib2.build_opener()
    f = opener.open(request).read()
    curr_pos = 300    # Skip header..
    curr_pos = f.find("<LargeImage>", curr_pos+10)
    if curr_pos == -1:
        return imgfound
    url_start = f.find("<URL>http://", curr_pos)+len("<URL>")
    url_end = f.find("</URL>", curr_pos)
    img_url = f[url_start:url_end]
    urllib.urlretrieve(img_url, dest_filename)
    imgfound = True
    return imgfound
if __name__ == "__main__":
    import sys
    if len(sys.argv) != 4:
        print 'Usage: %s artist album outfile' % sys.argv[0]
        sys.exit(1)
    artist = sys.argv[1]
    album = sys.argv[2]
    outfile = sys.argv[3]
    res = download_image_to_filename(artist, album, outfile)
    print res

Example use from the command-line:

./amazon.py "Boards of Canada" "The Campfire Headphase" boc.jpg
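The ad-hoc find() scanning in the snippet is fragile. If a service returned XML in the same shape, the standard xml.etree.ElementTree module would do the extraction more robustly; a Python 3 sketch on a made-up sample response (the tag names follow the old ItemSearch format, the data is invented):

```python
# Extract the first <LargeImage><URL> from an ItemSearch-style XML
# response. The sample below is fabricated; the live service is gone.
import xml.etree.ElementTree as ET

sample = """<ItemSearchResponse>
  <Items>
    <Item>
      <LargeImage>
        <URL>http://images.amazon.com/images/P/B000BRD5F2.jpg</URL>
      </LargeImage>
    </Item>
  </Items>
</ItemSearchResponse>"""

def first_large_image_url(xml_text):
    root = ET.fromstring(xml_text)
    # .// searches the whole tree, so nesting depth doesn't matter
    url = root.find(".//LargeImage/URL")
    return url.text if url is not None else None

print(first_large_image_url(sample))
# http://images.amazon.com/images/P/B000BRD5F2.jpg
```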

Minimal webserver in python

There is a minimal web server in the Python standard library. To use it, change to the directory you want to serve and run the script below:

[download simplewebserver.py]

#!/usr/bin/env python
import SimpleHTTPServer, BaseHTTPServer
try:
    # serve the current directory; BaseHTTPServer.test reads the port
    # from sys.argv[1], defaulting to 8000
    BaseHTTPServer.test(SimpleHTTPServer.SimpleHTTPRequestHandler,
                        BaseHTTPServer.HTTPServer)
except KeyboardInterrupt:
    print '^C received, bye bye!'

By default, the directory is served on port 8000, so point your browser to http://localhost:8000/ to see your freshly online website. Alternatively, you can specify the port as the first argument. For example:

 > simplewebserver.py 1234
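In Python 3 the module moved to http.server, and the command-line equivalent is simply `python3 -m http.server 8000`. A sketch of the programmatic Python 3 version (port 0 asks the OS for a free port; use 8000 for the classic default):

```python
# Python 3 equivalent of the script above: serve the current directory.
from http.server import HTTPServer, SimpleHTTPRequestHandler

def make_server(port=0):
    # port 0 lets the OS pick a free port; use 8000 for the classic default
    return HTTPServer(("localhost", port), SimpleHTTPRequestHandler)

httpd = make_server()
print("serving on port", httpd.server_port)
# httpd.serve_forever() would block here until Ctrl-C
httpd.server_close()
```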

Regexp in python: extract email addresses

An example of using regular expressions in Python to extract email addresses from a text file.

[download extract-email.py]

#!/usr/bin/env python
# Description: extract email addresses from stdin to stdout
# Usage example: cat whatever.txt | extract-email.py | sort -fu > addr.txt
import re, sys
email_pattern = re.compile(r'([\w\-\.]+@(\w[\w\-]+\.)+[\w\-]+)')
for line in sys.stdin:
    # there are several matches per line
    for match in email_pattern.findall(line):
        print match[0]
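A quick illustration of what the pattern matches on a sample string (note that findall returns one tuple of groups per match, which is why the script prints match[0]):

```python
# Demonstrate the email pattern on sample text (addresses are invented).
import re

email_pattern = re.compile(r'([\w\-\.]+@(\w[\w\-]+\.)+[\w\-]+)')

text = "Contact alice@example.com or bob.smith@mail.example.org for details."
# each match is a tuple; group 0 is the whole address
addresses = [m[0] for m in email_pattern.findall(text)]
print(addresses)
# ['alice@example.com', 'bob.smith@mail.example.org']
```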

Backup script (rsync via ssh)

This is my totally not original backup script. A directory holds symbolic links to the files and folders I want to save, and that directory is synced to a remote host.

[download backup.sh]


#!/bin/sh
set -u
set -e

# The directory to be backed up (placeholder value; adjust)
DIR="$HOME/backup/"

# The remote host (placeholder)
RHOST=example.org

# The directory on the remote host (placeholder)
RDIR=backups/

# rsync options reminder:
#   -a archive mode
#   -v verbose
#   -z compression during transfer
#   --delete remove files that were deleted in the source directory
#   --copy-unsafe-links
#        when a symlink is encountered, the referent is copied
#        instead, *only* if it is out of the saved tree

rsync -avz --delete --copy-unsafe-links -e ssh $DIR $RHOST:$RDIR
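When more logic is needed around the backup, the same invocation can be assembled from Python and handed to subprocess; a hypothetical helper (host and directories are placeholders, not the script's real values):

```python
# Hypothetical Python wrapper around the rsync call above.
def backup_cmd(src, rhost, rdir, dry_run=False):
    cmd = ["rsync", "-avz", "--delete", "--copy-unsafe-links", "-e", "ssh"]
    if dry_run:
        cmd.append("-n")  # dry run: list what would change, transfer nothing
    cmd += [src, "{0}:{1}".format(rhost, rdir)]
    return cmd

# to execute: subprocess.run(backup_cmd(...), check=True)
print(backup_cmd("~/backup/", "example.org", "backups/", dry_run=True))
```

Starting with a dry run (`-n`) is a cheap way to check what rsync would delete before letting `--delete` loose on the remote copy.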