This is a tutorial for using the OpenCage geocoding API in Stata.
Before we dive in to the tutorial you can
opencagegeo is a Stata module written by Lars Zeigermann to access the OpenCage Geocoding API. You can find the newest version here.
Install (or update) opencagegeo
* Install the Stata module and two required user-written Stata libraries from SSC:
. ssc install opencagegeo
. ssc install libjson
. ssc install insheetjson
* If you already have opencagegeo installed, make sure you have the newest version
. adoupdate opencagegeo, update
Batch geocode addresses (forward geocoding)
* If you have a dataset of addresses stored in a single string variable 'address'
. opencagegeo, key(YOUR-API-KEY) fulladdress(address)
* If your addresses are stored in separate variables, e.g. house number in 'num', street name in 'str', city in 'city', and country in 'ctry':
. opencagegeo, key(YOUR-API-KEY) number(num) street(str) city(city) country(ctry)
Batch geocode coordinates (reverse geocoding)
* To geocode coordinates stored in a single variable 'coords' in the following format: latitude,longitude
. opencagegeo, key(YOUR-API-KEY) coordinates(coords)
* If your coordinates are stored in two separate variables 'lat' and 'lng'
. opencagegeo, key(YOUR-API-KEY) latitude(lat) longitude(lng)
Geocoding a single address or pair of coordinates
To geocode a single address or pair of coordinates, you can use opencagegeoi, the immediate version of opencagegeo.
* First you need to save your API key to a global macro 'mykey'
. global mykey YOUR-API-KEY
. opencagegeoi YOUR-ADDRESS-HERE
. opencagegeoi YOUR-LATITUDE,YOUR-LONGITUDE
. help opencagegeo
Troubleshooting common problems
- If your dataset is of any significant size (you have more than 20,000 locations to geocode) please read our guide to geocoding large datasets where we explain various strategies and points to consider.
Unfortunately, Stata does not cope well with API responses that contain place names with apostrophes, for example the Earl's Court area of London. The apostrophes in our JSON response cause Stata's JSON parsing engine to die, which in turn causes the program to die. Adding to the confusion, the opencagegeo module unhelpfully then falls back to a default error message which says "Invalid key, rate limit exceeded or no internet connection", which is simply incorrect. You can test your API key by clicking on the "Sample request using this key" link in your account dashboard.
So, how can you solve this and go forward with your geocoding? The only solution we have found is to determine which of your queries produces the problematic response and exclude that query from your data set. A tedious process, we know, and we sympathise. The other option is to use another programming language like R, Python, Matlab, etc. Happily, we have tutorials for all of those languages, but we also appreciate that it is not easy to jump to another language. Sorry.
We welcome all suggestions as to how to prevent this bug. If anyone from StataCorp is reading this, please get in touch and we can supply examples. Stata is the only language where this seems to happen.
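Finding the query that triggers the problem can be partly automated. The following is a minimal sketch intended for a do-file; it is not part of opencagegeo. It assumes your addresses are in a string variable 'address' and, importantly, that opencagegeo exits with a non-zero return code when the response cannot be parsed, which may differ between versions. The variable name 'geocode_failed' and the macro names are illustrative. It sends one request per observation, which counts against your quota, so consider running it on a suspect subset first.

```stata
* Sketch: geocode one observation at a time and record which ones fail.
* Assumption: opencagegeo returns a non-zero return code on failure.
tempfile full
save "`full'"

local N = _N
local bad                                  // list of failing observation numbers

forvalues i = 1/`N' {
    * Load only observation i from the saved copy of the data
    use in `i' using "`full'", clear
    capture opencagegeo, key(YOUR-API-KEY) fulladdress(address)
    local rc = _rc
    if `rc' != 0 {
        local bad `bad' `i'
        display as text "Observation `i' failed (return code `rc')"
    }
}

* Reload the original data and flag the problem rows
use "`full'", clear
generate byte geocode_failed = 0
foreach i of local bad {
    quietly replace geocode_failed = 1 in `i'
}
list address if geocode_failed
```

Once the flagged rows are known, you can drop them (for example with `drop if geocode_failed`) before running the full batch. Reloading from a temporary file is used here rather than preserve and restore, to avoid any clash if the module preserves the data internally.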
In older versions of this software there is an optional parameter paidkey, which needs to be set if you are an OpenCage customer, so that the software can deal with the slight difference in format between free trial and paid responses. This is not needed in the newest version.