It's fairly easy to find free worldwide city data online, from sources such as the MaxMind WorldCities database and the GeoNames data dump. Converting the provided CSV data into a usable hierarchy of countries, regions, cities and districts can be a bit of a pain though. That's why I created django-cities, which you can plug straight into any Django project to get the whole hierarchy of data.
The project comes with a full dump of the GeoNames data, already converted to the hierarchical form and cleaned up, so if you use the project you won't have to worry about the source data at all. For those who are interested, or who want to clean the data up differently than I did, the import script is included in the project.
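To give a feel for what that conversion involves, here is a minimal sketch of grouping flat CSV rows into a nested country, region and city structure. It is not the project's import script: the column names (`country`, `region`, `city`), the `build_hierarchy` helper and the sample rows are all assumptions made up for illustration, and the real GeoNames files use a different layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flat export, one city per row. The column names are
# assumptions for this sketch, not the real GeoNames layout.
SAMPLE_CSV = """\
country,region,city
United Kingdom,Kent,Dymchurch
United Kingdom,Leicestershire,Leicester
United Kingdom,Warwickshire,Coventry
France,Normandy,Rouen
"""


def build_hierarchy(rows):
    """Group flat rows into {country: {region: [city, ...]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for row in rows:
        tree[row["country"]][row["region"]].append(row["city"])
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {country: dict(regions) for country, regions in tree.items()}


hierarchy = build_hierarchy(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for country in sorted(hierarchy):
    for region in sorted(hierarchy[country]):
        print("%s > %s > %s" % (country, region, ", ".join(hierarchy[country][region])))
```

In django-cities the same nesting is stored as foreign keys between models rather than as nested dictionaries, which is what makes the queries below possible.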
For each country there's data about the population size, TLD, continent, and ISO country code (usually, but not always, the same as the TLD). With just the country data we can perform some interesting queries. For example, which country owns the currently popular TLD "ly":
>>> from cities.models import *
>>> Country.objects.get(tld='ly')
<Country: Libya>
and which are the 5 most populous countries:
>>> Country.objects.order_by('-population')[:5]
[<Country: China>, <Country: India>, <Country: United States>, <Country: Indonesia>, <Country: Brazil>]
Countries are broken down into regions, and then regions into cities. For each city we have the latitude, longitude and population. The City model also has a couple of handy functions, such as one to find the nearest city to a given latitude and longitude:
>>> City.objects.nearest_to(51, 1)
<City: Dymchurch, Kent, United Kingdom>
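If you're curious what a nearest-city lookup boils down to, the sketch below does it in plain Python with the haversine great-circle formula. This is only an illustration of the idea and not how django-cities does it (the project leans on the database's geo support instead). The `Place` tuple, the `haversine_km` helper and the approximate coordinates are my own assumptions for the example.

```python
import math
from collections import namedtuple

# Hypothetical stand-in for a city record; not the django-cities model.
Place = namedtuple("Place", "name latitude longitude")

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_to(places, latitude, longitude):
    """Return the place with the smallest distance to the given point."""
    return min(
        places,
        key=lambda p: haversine_km(latitude, longitude, p.latitude, p.longitude),
    )


# Approximate coordinates, for illustration only.
places = [
    Place("London", 51.51, -0.13),
    Place("Paris", 48.86, 2.35),
    Place("Dymchurch", 51.03, 1.00),
]
print(nearest_to(places, 51, 1).name)
```

A linear scan like this is fine for a handful of places, but with the full worldwide data set you want the database's spatial index doing the work.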
All of the models use the GeoManager, so we can perform more complex geo-based queries, and also combine geo queries with normal ones. For example, here we get the 5 closest cities to London that have a population of more than 250,000 people:
>>> london = City.objects.filter(region__country__name='United Kingdom').get(name='London')
>>> cities = City.objects.distance(london.location).exclude(id=london.id).filter(population__gt=250000).order_by('distance')[:5]
>>> for c in cities:
...     print "%s (%0.2f miles)" % (c, c.distance.mi)
...
Leicester, Leicestershire, United Kingdom (104.33 miles)
Coventry, Warwickshire, United Kingdom (114.13 miles)
Birmingham, Birmingham, United Kingdom (140.33 miles)
Hull, Lincolnshire, United Kingdom (155.17 miles)
Wolverhampton, St. Helens, United Kingdom (157.33 miles)
The largest cities are broken down into districts, each of which also has a latitude, longitude and population count. All of the models in the cities project have a hierarchy property, which returns a list containing the current object and all its parents. For a Country object this would of course just contain the Country object, and at the other extreme, for a District object, it would contain a District, City, Region and Country.
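As a rough idea of how such a property can work, here is a small sketch using plain Python objects instead of Django models. The `Node` class and its `parent` attribute are hypothetical names for this example, and the ordering (current object first, top-level country last) is my assumption here; check the project source for the order the real property uses.

```python
class Node(object):
    """Hypothetical stand-in for a Country, Region, City or District."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None for a top-level country

    @property
    def hierarchy(self):
        """Return this object followed by each of its parents in turn."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


country = Node("United Kingdom")
region = Node("Kent", parent=country)
city = Node("Dymchurch", parent=region)

print(", ".join(n.name for n in city.hierarchy))
```

A country has no parent, so its hierarchy is a one-item list, while each level below it adds one more entry.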
django-cities is already being used to great effect on the Geomium community page directory (along with the great famfamfam world flag icons), and I'm hoping it can be used by many others. If you do start using django-cities, or have any suggestions or comments, I'd love to know!