Fetch album covers from Amazon

Edit: This does not seem to work anymore; Amazon's product API now requires signed requests.

This is a stripped-down snippet from Sonata that fetches music album covers through Amazon's web service.

[download amazon.py]

#!/usr/bin/env python
 
# Copyright 2006-2007 Scott Horowitz
# Licensed under the GPL
 
import urllib2, urllib
 
def download_image_to_filename(artist, album, dest_filename): 
    # Returns False if no images found
    imgfound = False
    img_url = ""
    # Amazon currently doesn't support utf8 and suggests latin1 encoding instead:
    try:
        artist = urllib.quote(artist.encode('latin1'))
        album = urllib.quote(album.encode('latin1'))
    except UnicodeError:
        artist = urllib.quote(artist)
        album = urllib.quote(album)
    amazon_key = "12DR2PGAQT303YTEWP02"
    search_url = "http://webservices.amazon.com/onca/xml?" \
                 + "Service=AWSECommerceService&AWSAccessKeyId=" \
                 + amazon_key \
                 + "&Operation=ItemSearch&SearchIndex=Music&Artist="\
                 + artist \
                 + "&ResponseGroup=Images&Keywords=" \
                 + album
    request = urllib2.Request(search_url)
    opener = urllib2.build_opener()
    f = opener.open(request).read()
    curr_pos = 300    # Skip the XML header
    curr_pos = f.find("<LargeImage>", curr_pos+10)
    if curr_pos == -1:
        # No large image in the response
        return imgfound
    url_start = f.find("<URL>http://", curr_pos)+len("<URL>")
    url_end = f.find("</URL>", curr_pos)
    img_url = f[url_start:url_end]
    urllib.urlretrieve(img_url, dest_filename)
    imgfound = True
    return imgfound
 
if __name__ == "__main__":
    import sys
    try:
        artist = sys.argv[1]
        album = sys.argv[2]
        outfile = sys.argv[3]
    except IndexError:
        print 'Usage: %s artist album outfile' % sys.argv[0]
        sys.exit(1)
    res = download_image_to_filename(artist, album, outfile)
    print res

Example use from the command line:

./amazon.py "Boards of Canada" "The Campfire Headphase" boc.jpg
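The string searching above is fragile. A sketch of the same extraction done with xml.etree.ElementTree instead (assuming the response still contains <LargeImage><URL>…</URL> elements, possibly under an XML namespace):

```python
import xml.etree.ElementTree as ET

def extract_large_image_url(xml_text):
    # Return the first <LargeImage><URL> value, or None if absent.
    root = ET.fromstring(xml_text)
    # Compare only the local part of each tag, so a default XML
    # namespace on the response does not break the lookup.
    for elem in root.iter():
        if elem.tag.split('}')[-1] == 'LargeImage':
            for child in elem:
                if child.tag.split('}')[-1] == 'URL':
                    return child.text
    return None
```

This returns None instead of silently slicing a bogus substring when no image is present.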


Minimal webserver in python

There is a minimal web server in the Python standard library. To use it, change into the directory you want to serve and run the script below:

[download simplewebserver.py]

#!/usr/bin/env python
 
import SimpleHTTPServer, BaseHTTPServer
 
try:
    BaseHTTPServer.test(SimpleHTTPServer.SimpleHTTPRequestHandler, 
                        BaseHTTPServer.HTTPServer)
except KeyboardInterrupt:
    print '^C received, bye bye!'

By default, the directory is served on port 8000, so point your browser to http://localhost:8000/ to see your freshly served files. Alternatively, you can pass the port as the first argument. For example:

 > simplewebserver.py 1234
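In Python 3 the two modules were merged into http.server. A self-contained sketch of the same idea, driving the handler programmatically rather than through the test() helper: it serves a scratch directory on an ephemeral port, fetches a file over HTTP, and shuts down again.

```python
import http.server
import os
import tempfile
import threading
import urllib.request

# Work in a scratch directory containing one known file.
os.chdir(tempfile.mkdtemp())
with open('hello.txt', 'w') as f:
    f.write('hi')

# SimpleHTTPRequestHandler serves the current working directory;
# port 0 asks the OS for any free port.
httpd = http.server.HTTPServer(('127.0.0.1', 0),
                               http.server.SimpleHTTPRequestHandler)
port = httpd.server_address[1]
threading.Thread(target=httpd.serve_forever, daemon=True).start()

body = urllib.request.urlopen('http://127.0.0.1:%d/hello.txt' % port).read()
print(body)  # → b'hi'
httpd.shutdown()
```

From the command line, the one-liner `python3 -m http.server 8000` does the same job as the Python 2 script above.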

Regexp in python: extract email addresses

An example of using regular expressions in Python to extract email addresses from a text file.

[download extract-email.py]

#!/usr/bin/env python
 
"""
Description: extract email addresses from stdin to stdout
Usage example: cat whatever.txt | extract-email.py | sort -fu > addr.txt
"""
 
import re, sys
 
email_pattern = re.compile(r'([\w\-\.]+@(\w[\w\-]+\.)+[\w\-]+)')
 
for line in sys.stdin:
    # there may be several matches per line
    for match in email_pattern.findall(line):
        print match[0]
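Because the pattern contains two capture groups, findall() returns one tuple per match; the full address is the first element, which is why the loop prints match[0]. A quick check (in Python 3 syntax, on a made-up sample line):

```python
import re

email_pattern = re.compile(r'([\w\-\.]+@(\w[\w\-]+\.)+[\w\-]+)')

line = "Contact alice@example.com or bob.smith@mail.example.org please."
# findall() yields (whole_address, last_domain_part) tuples;
# keep only the whole address from each.
addresses = [match[0] for match in email_pattern.findall(line)]
print(addresses)  # → ['alice@example.com', 'bob.smith@mail.example.org']
```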

Backup script (rsync via ssh)

This is my thoroughly unoriginal backup script. A directory holds symbolic links to the files and folders I want to save, and that directory is mirrored to a remote host.

[download backup.sh]

#!/bin/bash

set -u
set -e

# The directory to be backed up
DIR="$HOME/backed.up/"

# The remote host
RHOST="whatever"

# The directory on the remote host
RDIR=$(hostname)/

# rsync options reminder:
#   -a archive mode
#   -v verbose
#   -z compression during transfer
#   --delete remove files that were deleted in the source directory
#   --copy-unsafe-links
#        when a symlink is encountered, the referent is copied
#        instead, *only* if it is out of the saved tree

rsync -avz --delete --copy-unsafe-links -e ssh "$DIR" "$RHOST:$RDIR"
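Populating the staging directory might look like this (the linked targets here are just hypothetical examples):

```shell
#!/bin/bash

# Create the staging directory and link in whatever should be saved.
mkdir -p "$HOME/backed.up"
ln -sfn "$HOME/.bashrc"   "$HOME/backed.up/bashrc"
ln -sfn "$HOME/projects"  "$HOME/backed.up/projects"

# rsync's --copy-unsafe-links will then copy the link targets,
# since they live outside the backed-up tree itself.
ls -l "$HOME/backed.up"
```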