This is a tutorial for using the OpenCage geocoding API in Stata.
Topics covered in this tutorial
- General Background
- Installing opencagegeo
- Geocoding a single location
- Forward geocoding
- Reverse geocoding
- Troubleshooting / Common Problems
- Further reading
Background
Before we dive into the tutorial:
- Sign up for an OpenCage geocoding API key.
- Play with the demo page, so that you see the actual response the API returns.
- Browse the API reference, so you understand the optional parameters, best practices, possible response codes, and the rate limiting on free trial accounts.
Install (or update) opencagegeo
opencagegeo is a Stata module written by Lars Zeigermann to access the OpenCage Geocoding API. You can find the newest version here.
* Install the Stata module and two required user-written Stata libraries from SSC:
. ssc install opencagegeo
. ssc install libjson
. ssc install insheetjson
* If you already have opencagegeo installed, make sure you have the newest version:
. adoupdate opencagegeo, update
Geocoding a single address or pair of coordinates
To geocode a single address or pair of coordinates, you can use opencagegeoi, the immediate version of opencagegeo.
* First you need to save your API key to a global macro 'mykey'
. global mykey YOUR-API-KEY
. opencagegeoi YOUR-ADDRESS-HERE
. opencagegeoi YOUR-LATITUDE,YOUR-LONGITUDE
Batch geocode addresses (forward geocoding)
* If you have a dataset of addresses stored in a single string variable 'address':
. opencagegeo, key(YOUR-API-KEY) fulladdress(address)
* If your addresses are stored in separate variables, e.g. house number in 'num', street name in 'str', city in 'city', and country in 'ctry':
. opencagegeo, key(YOUR-API-KEY) number(num) street(str) city(city) country(ctry)
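As a worked illustration of the single-variable case, the do-file sketch below builds a two-row dataset and geocodes it. The addresses are made up for the example, and passing the key through the global macro 'mykey' relies only on ordinary Stata macro expansion. The names of the result variables depend on the module version, so the sketch simply lists whatever was added rather than assuming specific names.

```stata
* Build a tiny example dataset (the addresses are illustrative only)
clear
input str60 address
"Brandenburg Gate, Berlin, Germany"
"Trafalgar Square, London, United Kingdom"
end

* Store the key once and reuse it through macro expansion
global mykey YOUR-API-KEY

* Forward geocode the string variable 'address'
opencagegeo, key($mykey) fulladdress(address)

* Inspect the variables the module added to the dataset
describe
list
```

Trying a handful of rows like this before a full batch run is a cheap way to confirm that the key and the installation work.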
Batch geocode coordinates (reverse geocoding)
* To geocode coordinates stored in a single variable 'coords' in the following format: latitude,longitude
. opencagegeo, key(YOUR-API-KEY) coordinates(coords)
* If your coordinates are stored in two separate variables 'lat' and 'lng':
. opencagegeo, key(YOUR-API-KEY) latitude(lat) longitude(lng)
Learn more
. help opencagegeo
Troubleshooting common problems
- If your dataset is of any significant size (you have more than 20,000 locations to geocode) please read our guide to geocoding large datasets where we explain various strategies and points to consider.
- Unfortunately, Stata does not cope well with API responses that contain place names with apostrophes, for example the Earl's Court area of London. The apostrophes in our JSON response cause Stata's JSON parsing engine to die, which in turn causes the program to die. Adding to the confusion, the opencagegeo module then unhelpfully falls back to a default error message, "Invalid key, rate limit exceeded or no internet connection", which is simply incorrect. You can test your API key by clicking on the "Sample request using this key" link in your account dashboard.
  So, how can you solve this and go forward with your geocoding? The only solution we have found is to determine which of your queries triggers the problematic response and exclude that query from your dataset. A tedious process, we know, and we sympathise. The other option is to use another programming language like R, Python, Matlab, etc. Happily, we have tutorials for all of those languages, but we also appreciate that it is not easy to jump to another language. Sorry.
  We welcome all suggestions as to how to prevent this bug. If anyone from StataCorp is reading this, please get in touch and we can supply examples. Stata is the only language where this seems to happen.
- In older versions of this software there is an optional parameter, paidkey, which needs to be set if you are an OpenCage customer, so that the software can deal with the slight difference in format between free trial and paid responses. This is not needed in the newest version.
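One way to track down the query behind the apostrophe problem is to geocode one observation at a time and record which rows make the command fail. The do-file sketch below is only a rough approach under several assumptions: that opencagegeo exits with a non-zero return code when it hits the problem, that the addresses are in a string variable 'address', and that the key is stored in the global macro 'mykey'. The variable name 'failed' is made up for illustration. Each row costs one API request, the results of this diagnostic pass are discarded, and a row can also be flagged for unrelated reasons such as a dropped connection.

```stata
* Flag rows whose query makes opencagegeo stop with an error.
* Assumption: the module exits with a non-zero return code on failure.
generate byte failed = 0          // 'failed' is a hypothetical helper variable
local n = _N

forvalues i = 1/`n' {
    preserve
    quietly keep in `i'           // work on a single observation
    capture opencagegeo, key($mykey) fulladdress(address)
    local rc = _rc                // save the return code before restoring
    restore                       // bring back the full dataset
    if `rc' != 0 {
        quietly replace failed = 1 in `i'
    }
    sleep 1000                    // pause one second between requests
}

* Show the problem queries, then exclude them before the real batch run
list address if failed == 1
drop if failed == 1
```

For large datasets, running this only on the block of rows where the batch run stopped keeps the number of extra requests small.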