This is a tutorial for using the OpenCage Geocoding API in Stata.
Before you can query the API you will need to
sign up for an OpenCage API key.
Once you've done that we recommend you spend five minutes on:
* Install the Stata module and two required user-written Stata libraries from SSC:
. ssc install opencagegeo
. ssc install libjson
. ssc install insheetjson
* If you already have opencagegeo installed, make sure you have the newest version
. adoupdate opencagegeo, update
Batch geocode addresses (forward geocoding)
* If you have a dataset of addresses stored in a single string variable 'address'
. opencagegeo, key(YOUR-API-KEY) fulladdress(address)
* If your addresses are stored in separate variables, e.g. house number in 'num', street name in 'str', city in 'city', and country in 'ctry':
. opencagegeo, key(YOUR-API-KEY) number(num) street(str) city(city) country(ctry)
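As a rough illustration of the single-variable layout described above, the sketch below builds a tiny dataset by hand. The two addresses are made-up sample values, not data from this tutorial, and the final call is left commented out because it needs a real API key.

```stata
* Minimal sketch: a toy dataset with one string variable 'address'
clear
input str60 address
"Brandenburger Tor, Berlin, Germany"
"10 Downing Street, London, United Kingdom"
end

* Inspect what would be sent to the API
list address

* Replace YOUR-API-KEY with your own key and uncomment to send the requests
* opencagegeo, key(YOUR-API-KEY) fulladdress(address)
```

Each observation is sent as one request, so a short test dataset like this is a cheap way to check your key and setup before geocoding a large file.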
Batch geocode coordinates (reverse geocoding)
* To geocode coordinates stored in a single variable 'coords' in the following format: latitude,longitude
. opencagegeo, key(YOUR-API-KEY) coordinates(coords)
* If your coordinates are stored in two separate variables 'lat' and 'lng'
. opencagegeo, key(YOUR-API-KEY) latitude(lat) longitude(lng)
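If you would rather use the single-variable form, the sketch below shows one way to build a 'coords' string in the latitude,longitude format from two numeric variables. The sample coordinates and the six-decimal display format are illustrative assumptions, not requirements of the module.

```stata
* Minimal sketch: combine numeric 'lat' and 'lng' into one 'coords' string
clear
input double lat double lng
52.516272 13.377722
51.503396 -0.127640
end

* string() formats each number; trim() removes the padding spaces
* ASSUMPTION: six decimal places are enough precision for your data
generate str40 coords = trim(string(lat, "%12.6f")) + "," + trim(string(lng, "%12.6f"))

list coords
```

The resulting variable can then be passed to the coordinates() option shown above.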
Geocoding a single address or pair of coordinates
To geocode a single address or pair of coordinates, you can use
the immediate version of the command, opencagegeoi.
* First you need to save your API key to a global macro 'mykey'
. global mykey YOUR-API-KEY
. opencagegeoi YOUR-ADDRESS-HERE
. opencagegeoi YOUR-LATITUDE,YOUR-LONGITUDE
Unfortunately, Stata does not cope well with parsing API responses that
contain place names with apostrophes. This affects, for example,
some areas of London. The apostrophes in our JSON response
cause Stata's JSON parsing engine to die, which in turn causes the program to
fail. Adding to the confusion, the
module unhelpfully then falls back to a default error message which says
Invalid key, rate limit exceeded or no internet connection
which is simply incorrect. You can test your API key by
clicking on the "Sample request using this key" link in
your account dashboard.
So, how can you solve this and go forward with your geocoding?
The only solution we have found is to determine
which of your queries is causing the response which leads to
the problem and exclude that query from your data set.
A tedious process, we know and sympathise.
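One way to narrow down the offending query is to geocode one observation at a time and record which rows fail. The sketch below is only an illustration: the variable 'geocode_failed' and the toy addresses are hypothetical, and it assumes that opencagegeo exits with a nonzero return code when it fails. Note that it spends one API request per row and throws the results away, and that any other failure (bad key, no connection) will be flagged too, so test your key first.

```stata
* Toy data standing in for your own 'address' variable (illustrative values)
clear
input str60 address
"Brandenburger Tor, Berlin, Germany"
"10 Downing Street, London, United Kingdom"
"Plaza Mayor, Madrid, Spain"
end

* Flag for rows whose request made opencagegeo exit with an error
generate byte geocode_failed = 0

local n = _N
forvalues i = 1/`n' {
    preserve
    quietly keep in `i'
    * capture keeps the loop going if the module errors out
    * ASSUMPTION: opencagegeo exits with a nonzero return code on failure
    capture noisily opencagegeo, key(YOUR-API-KEY) fulladdress(address)
    local rc = _rc
    restore
    if `rc' != 0 {
        quietly replace geocode_failed = 1 in `i'
    }
}

* Rows to exclude before running the full batch
list address if geocode_failed
```

Once the problem rows are known, you can drop them (or save them to a separate file) and run the batch geocode on the remaining observations.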
The other option is to use another programming language like R, Python,
Matlab, etc. Happily we have tutorials for all of those, but we also
appreciate that it is not easy to jump to another language.
Sorry. We welcome all suggestions as to how to prevent this bug.
If anyone from StataCorp is reading this, please get in touch and we can
supply examples. Stata is the only language where this seems to happen.
In older versions of this software there is an optional parameter
which needs to be set if you are an OpenCage customer, so that the
software can deal with the slight difference in format between free
trial and paid responses. This is not needed in the newest version.