Geocoding your data

A lot of mapping programs want your data to be geocoded — that is, expressed as latitudes and longitudes. But a lot of time your data just has place names, not precise geocoordinates. To get these coordinates, you have several options:

UCLA Geography Geocoder

Batch geocoder (throttled so that it slows after 300 entries)

Google Fusion Tables can automatically geocode your data, but I’ve never found a way to export those latitudes and longitudes back out.

This add-on for Google sheets geocodes quickly.

Carto can do it automatically, but it does charge above a certain number of addresses.

To make things easier, you may want to filter all your addresses to the unique values, so that you’re only geocoding one instance of every address. Then geocoding goes much faster.

However, once you’ve done that, you’re left with two spreadsheets: one with geocoded addresses and one with just the addresses. How do you merge the two? You’ll use OpenRefine’s join function.

Open both of your spreadsheets in Open Refine. Let’s assume the original spreadsheet is called “Long Spreadsheet” and the spreadsheet with unique values in it is called “Short Spreadsheet”.

With Long Spreadsheet open in Open Refine, select the column that contains the address information. Then, using the tiny triangle drop-down menu at the top of the column, select Edit Column and then Add Column Based on This Column. In the window that pops up, replace the word “value” in the text box with

cell.cross(“Short Spreadsheet”, “City”).cells[“Longitude”].value[0]

This tells Open Refine to look in Short Spreadsheet, find the column called City and use that to find the corresponding longitude (stored in a column called Longitude). Then Open Refine will add a new column that contains that data.