Make maps like a boss

When I worked on metabolism I never needed to plot data on an actual geographic area (although I always wished my data had some kind of coordinate system like that). Since switching to health data, though, I have tons of spatial data. Moreover, this spatial component has a fairly important effect on the patterns and behavior that I observe.

So far I've been working with a GIS specialist (if you've looked at the maps in my small area estimation post, those are from him), but that's not ideal when I only need a quick picture or want to look at how data is distributed in space. Previously, when I needed to place data on a map I typically needed interactivity, so I would make a visualization with D3.js. But now I just need to make lots of maps quickly for exploratory analysis. Fortunately, I was looking for a solution right around the time the PyData conference was happening, and I saw Rob Story's presentation "Up and Down the Python Data and Web Visualization Stack". This talk turned me on to Folium, especially after seeing it work with the IPython notebook (which has become one of my favorite workflow tools).

Making maps with Folium is pretty damn easy. If you follow the examples on the GitHub page with the example data you will be making maps in no time. Now comes the fun part: making maps with our own data!

Step 1. Obtain Shapefile data

To plot our own data we will need a GeoJSON or TopoJSON file that contains the path information for drawing boundaries on the map. Unfortunately, the predominant file type for geographic data is the shapefile, and GeoJSON or TopoJSON files are pretty scarce on the web. For the USA, though, we can easily get maps at all geographic resolutions from the Census/TIGER website.

Step 2. Converting from shapefile->geojson->topojson

Since the open source world is rocking GeoJSON and TopoJSON files, we just need to convert our new shapefile over. To do this we will need to install ogr2ogr (which is part of the GDAL package) and topojson. On OSX this can be accomplished very simply with Homebrew and Node.js:

$ brew install gdal
$ brew install node
$ npm install -g topojson
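
Before moving on, it can be worth confirming that both command-line tools actually landed on your PATH. Here is a minimal sketch in Python; the helper name find_tools is my own for illustration and is not part of any of the packages above:

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# ogr2ogr comes with GDAL; topojson is the command-line tool installed via Node.js.
for tool, path in find_tools(["ogr2ogr", "topojson"]).items():
    print(f"{tool}: {path or 'NOT FOUND'}")
```

If either one reports NOT FOUND, the install step for that tool probably needs another look before the conversions below will work.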

I needed to plot data on Illinois zip codes, so I downloaded the ZCTA (ZIP Code Tabulation Area; not exactly the same as zip codes, but close enough for me) file for Illinois. Proceeding with this file, we first convert the shapefile to GeoJSON like so:

$ ogr2ogr -f GeoJSON illinois.json tl_2010_17_zcta510.shp
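
Once illinois.json exists, a few lines of Python will list the property names on the first feature, which saves scrolling through a very large file by hand. This is only a sketch: the helper name property_keys and the tiny in-memory sample (with made-up values) are my own, and the real file is only read if it happens to be in the working directory.

```python
import json
import os


def property_keys(geojson):
    """Return the sorted property names found on the first feature."""
    features = geojson.get("features", [])
    if not features:
        return []
    # "properties" may be null in GeoJSON, so fall back to an empty dict.
    return sorted(features[0].get("properties") or {})


# Tiny in-memory stand-in for illinois.json (values are made up).
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"ZCTA5CE10": "00000", "ALAND10": 1},
            "geometry": None,
        },
    ],
}
print(property_keys(sample))

# Run against the real file only if it is present.
if os.path.exists("illinois.json"):
    with open("illinois.json") as fh:
        print(property_keys(json.load(fh)))
```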

Next we convert the GeoJSON to TopoJSON. The only additional wrinkle compared with the GeoJSON conversion is that we need to set the ID on each one of the areas (which is the same ID that we'll be binding to when we add data to the map). We can open up the GeoJSON file and look at any of the "properties" entries to find the name of the needed key. For the ZCTA file this key is "ZCTA5CE10". With that, we convert the GeoJSON to TopoJSON, setting the ID and adding a zipcode property:

$ topojson --id-property ZCTA5CE10 -p zipcode=ZCTA5CE10 illinois.json -o illinois_topo.json

Step 3. Make one sweet map

This step is pretty simple. Take our new TopoJSON file and the CSV of data we have keyed on zipcode and throw it over the map like:

import folium
import pandas as pd
import numpy as np

# Load the CSV of data keyed on zipcode (the file path is missing from the original post)
df = pd.read_csv(...)

# Six evenly spaced thresholds spanning the range of the feature
bins = list(np.linspace(df['feature'].min(), df['feature'].max(), 6))

city_map = folium.Map(location=[41.8819, -87.6278], width='700',
                      tiles='Stamen Toner', zoom_start=10)
city_map.geo_json(geo_path='illinois_topo.json', topojson='objects.illinois',
                  data=df, threshold_scale=bins,
                  columns=['zipcode', 'feature'],
                  key_on='',  # value is empty in the original post
                  fill_opacity=1, line_opacity=1,
                  fill_color='PuBuGn', reset=True)
city_map.create_map('city.html')

and now we have a map which is embedded below! The official library as it stands only supports up to six colors. This wasn't enough for me, so I have a fork with additional color support on my GitHub page that anyone is welcome to use (it also has diverging color scales!).
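
If a map comes out with areas left unshaded, one thing worth ruling out is a mismatch between the keys in the data and the IDs set in the TopoJSON file. The sketch below checks that with nothing but the standard library. The helper names (topo_ids, unmatched_keys) and the miniature sample data are made up for illustration; it assumes the usual TopoJSON layout where each named entry under "objects" holds a list of "geometries".

```python
import csv
import io


def topo_ids(topology, object_name):
    """Collect the id of every geometry in one named TopoJSON object."""
    geometries = topology["objects"][object_name]["geometries"]
    return {str(g["id"]) for g in geometries if "id" in g}


def unmatched_keys(rows, key_column, ids):
    """Return data keys (as strings) that have no geometry with that id."""
    return sorted({str(row[key_column]) for row in rows} - ids)


# Miniature stand-ins for illinois_topo.json and the data CSV (made-up values).
topology = {
    "type": "Topology",
    "objects": {
        "illinois": {
            "type": "GeometryCollection",
            "geometries": [
                {"type": "Polygon", "id": "60601", "arcs": []},
                {"type": "Polygon", "id": "60602", "arcs": []},
            ],
        }
    },
    "arcs": [],
}
csv_text = "zipcode,feature\n60601,1.5\n60699,2.0\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

print(sorted(topology["objects"]))  # the object names available in the file
print(unmatched_keys(rows, "zipcode", topo_ids(topology, "illinois")))
```

To run it on real files, load the topology with json.load and read the rows from the actual CSV. The first print is also a quick way to confirm the object name that goes after "objects." in the topojson argument.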
