The problem
You need to geocode many locations quickly.
Background
Before we begin, it's important to understand what happens when you make an API request. Basically:
1. Your request leaves your computer and crosses the internet to our servers.
2. Our software answers the request.
3. Our servers send the answer back to you, across the internet.
As you can imagine, how long steps 1 and 3 take depends on where you are on the internet.
Batch or Bulk geocoding
One idea that often comes up is to send multiple locations per request,
or to send a file full of locations that can be processed and then be
downloaded after all locations are geocoded. This is often referred to as
"batch" or "bulk" geocoding.
We intentionally don’t support more than one location per request. Our
(hard-earned) experience is that the conceptually much simpler
"one location, one request" model is much less likely to lead to
misunderstandings or implementation errors. That saves engineering time,
which is the most valuable resource for almost all of our customers.
The way to process many locations quickly is to make requests in parallel,
as described below under "Requesting in parallel rather than in series".
Solutions
Caching
First of all, the fastest request is the one you don't make.
Unlike with many geocoding services that are built on non-open data, you can
store our results as long as you like, whether you are a customer or not.
Can you do more caching to reduce the number of requests you need to send us?
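As an illustration of the idea, here is a minimal sketch of a local cache in Python. It is not part of any official client library: the `GeocodeCache` class and the `fetch` callable (whatever function you use to send a single request) are hypothetical names chosen for this example, and SQLite is just one convenient place to keep results.

```python
import json
import sqlite3


class GeocodeCache:
    """Tiny persistent cache mapping a normalised query to a stored result."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to keep results between runs.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (query TEXT PRIMARY KEY, result TEXT)"
        )

    @staticmethod
    def normalise(query):
        # Collapse case and whitespace so trivially different spellings
        # of the same query share one cache entry.
        return " ".join(query.lower().split())

    def get_or_fetch(self, query, fetch):
        key = self.normalise(query)
        row = self.db.execute(
            "SELECT result FROM cache WHERE query = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # cache hit: no request is sent
        result = fetch(query)  # cache miss: exactly one request
        self.db.execute(
            "INSERT OR REPLACE INTO cache (query, result) VALUES (?, ?)",
            (key, json.dumps(result)),
        )
        self.db.commit()
        return result
```

With this in place, repeated or near-duplicate locations in your dataset cost one request in total rather than one request each.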
Do not use a proxy or VPN
If possible, ensure your requests come directly to our servers rather
than being routed through a proxy or VPN.
Requesting in parallel rather than in series
Assuming that you have checked your cache and decided that you do need to
make an API request, the way to churn through your dataset quickly is to
send many requests in parallel (at the same time) rather than in series
(one after another).
This option is not available to free trial users, who are limited to one
request per second. Paying customers can send us many more requests
per second (you can see the exact numbers on our pricing page).
How you run requests in parallel will depend on the programming language
you are using, but essentially it is as simple as having multiple workers
(threads, processes, or services) sending requests at the same time.
We have customers geocoding millions of locations per day. Requesting in
parallel works.
We have a command line tool for geocoding large files and example scripts
for making parallel requests in Python (see the "Running many parallel
queries" section of our tutorial), Node.js, Ruby, and PHP.
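To make the "one location, one request, many at once" pattern concrete, here is a minimal sketch using a thread pool from Python's standard library. It is not one of the example scripts mentioned above. `geocode_one` is a hypothetical callable that you supply to send a single request and return its result, and the default of 10 workers is an arbitrary assumption; pick a value that keeps you within the requests-per-second limit of your plan.

```python
from concurrent.futures import ThreadPoolExecutor


def geocode_all(locations, geocode_one, max_workers=10):
    """Send one request per location, with up to max_workers in flight at once.

    geocode_one is a hypothetical callable: it takes one location string,
    performs one API request, and returns the parsed result.
    Returns a list of (location, result, error) tuples in input order.
    """

    def attempt(location):
        try:
            return location, geocode_one(location), None
        except Exception as exc:  # one failed request should not stop the rest
            return location, None, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input locations.
        return list(pool.map(attempt, locations))
```

Threads suit this workload because each worker spends most of its time waiting on the network. Failed locations come back with their exception attached, so they can be retried afterwards without repeating the requests that succeeded.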
Speeding up individual requests
Beyond parallelism, there are several things you can do to help us answer
each request more quickly (step 2 in the list above).
- Do you need the information in our annotations? If not, adding the optional parameter no_annotations=1 will skip that step and let us respond slightly more quickly. It also reduces the size of the response considerably (and thus reduces the amount of information we need to send back to you).
- Please do everything you can to follow our best practices, especially the forward geocoding query formatting guide. In general, the longer your query, the longer it will take us to respond.
- We cache forward geocoding requests, unless you have specified no_record=1. So it may be slightly faster if you don't use that optional parameter, though it will depend on how common your requests are.
- Finally, turning pretty printing off, i.e. NOT using pretty=1, marginally reduces the response size.
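The parameter advice above could be applied when building each request URL, roughly as in the sketch below. The optional parameters `no_annotations`, `no_record` and `pretty` are the ones named above. Everything else is an assumption made for illustration: the `build_request_url` helper, the `q` and `key` parameter names, and the base URL are hypothetical, so check the API reference for the real values.

```python
from urllib.parse import urlencode


def build_request_url(base_url, query, api_key, annotations=False, record=True):
    """Build a lean request URL for one location.

    base_url, and the "q" and "key" parameter names, are assumptions
    made for this example rather than confirmed API details.
    """
    params = {"q": query, "key": api_key}
    if not annotations:
        # Skip annotations: slightly faster, and a much smaller response.
        params["no_annotations"] = 1
    if not record:
        # Only set no_record=1 if you need it; it bypasses the request cache.
        params["no_record"] = 1
    # pretty=1 is deliberately never added, to keep the response small.
    return f"{base_url}?{urlencode(params)}"
```

For example, `build_request_url("https://api.example.com/geocode", "Berlin, Germany", "YOUR-KEY")` produces a URL that carries `no_annotations=1` and neither `no_record` nor `pretty`.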