coderholic

PyWebShot - Generate website thumbnails using Python

There have been lots of links to automatic website thumbnail generators on sites like Reddit and Hacker News today, including webkit2png and CutyCapt. As it happens, I wrote my own website thumbnail generator a few weeks ago, and today I got around to putting it on GitHub.

The code is based on Matt Biddulph's screenshot-tng script, but heavily modified to be more user-friendly and to provide more options. It uses embedded Mozilla for rendering, and therefore requires the python-gtkmozembed package.

You can specify the resolution at which the screenshot is taken, and also a resolution for the thumbnail. The aspect ratio is preserved when the thumbnail is generated. You can also specify a delay, so that the screenshot is taken only a given number of seconds after the page has loaded. Here's an example of running PyWebShot with three URLs, along with the files it saves:

$ ./pywebshot.py -t 500x250 http://www.coderholic.com http://geomium.com/update/598/ http://jobs.plasis.co.uk
Loading http://www.coderholic.com... saved as www.coderholic.com.png
Loading http://geomium.com/update/598/... saved as geomium.com.update.598..png
Loading http://jobs.plasis.co.uk... saved as jobs.plasis.co.uk.png
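
Two details of that run can be sketched in a few lines of Python. This is not PyWebShot's actual code, just a rough illustration under some assumptions: that the thumbnail is scaled to fit inside the requested box, and that output names are built by dropping the URL scheme and turning slashes into dots (which is what the output above suggests). The function names thumbnail_size and output_filename are made up for this example.

```python
from urllib.parse import urlsplit


def thumbnail_size(src_w, src_h, max_w, max_h):
    """Largest size that fits inside max_w x max_h while keeping the
    source aspect ratio (assumed behaviour, not taken from PyWebShot)."""
    # Use the smaller of the two scale factors so neither side overflows.
    scale = min(max_w / src_w, max_h / src_h)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def output_filename(url):
    """Guess the saved file name: host plus path, '/' replaced by '.',
    with '.png' appended (inferred from the sample output only)."""
    parts = urlsplit(url)
    return (parts.netloc + parts.path).replace("/", ".") + ".png"
```

With these assumptions a 1024x768 screenshot and a 500x250 thumbnail box would give a 333x250 thumbnail, and the trailing slash in the second URL explains the double dot in geomium.com.update.598..png.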

If you have a huge list of URLs that you'd like screenshots for, you can put them all in a file and generate the images with the following command:

$ cat urls.txt | xargs ./pywebshot.py
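
If you'd rather drive the batch from Python than from xargs, something like the following might work. It is only a sketch: read_urls, build_command and run_batch are hypothetical helpers, the batch size of 50 is an arbitrary choice to keep each command line short, and the only PyWebShot option used is the -t flag shown earlier.

```python
import subprocess


def read_urls(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_command(urls, script="./pywebshot.py", thumb=None):
    """Build the argument list for one PyWebShot invocation."""
    command = [script]
    if thumb:
        # -t WIDTHxHEIGHT is the thumbnail option used in the example run.
        command += ["-t", thumb]
    return command + list(urls)


def run_batch(path, thumb=None, batch_size=50):
    """Run the script over the URLs in `path`, batch_size URLs at a time."""
    urls = read_urls(path)
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_command(batch, thumb=thumb), check=True)
```

Calling run_batch("urls.txt", thumb="500x250") would then be roughly equivalent to the xargs pipeline with a thumbnail size added. Unlike xargs, this reads one URL per line rather than splitting on any whitespace.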

For more details and the source code see the PyWebShot project page on GitHub.

Posted on 11 Apr 2010