Topics covered in this tutorial
- General Background
- Installing the opencage R package
- Forward geocoding
- Reverse geocoding
- Geocoding a list of places
- Geocoding a list of coordinates
- Troubleshooting / Testing API connection
- Alternatives for large datasets
- Further reading
Background
Before we dive into the tutorial:
- Sign up for an OpenCage geocoding API key.
- Play with the demo page, so that you see the actual response the API returns.
- Browse the API reference, so you understand the optional parameters, best practices, possible response codes, and the rate limiting on free trial accounts.
Install the opencage R package
The opencage R package is developed and maintained by rOpenSci and is available on CRAN.
Install stable version from CRAN
install.packages("opencage")
Install development version from R-universe
install.packages("opencage", repos = "https://ropensci.r-universe.dev")
Install development version from GitHub
# install.packages("pak")
pak::pak("ropensci/opencage")
Set up your API key
Store your API key as an environment variable. The recommended approach is to add it to your .Renviron file (located in your home directory). Open it for editing:
file.edit("~/.Renviron")
Add this line to the file and save:
OPENCAGE_KEY=YOUR-API-KEY
Restart R for the changes to take effect.
Alternatively, set the key for your current session only:
Sys.setenv(OPENCAGE_KEY = "YOUR-API-KEY")
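Before sending any requests, it can help to confirm that the key is actually visible to the current R session. The following is a minimal sketch in base R; has_opencage_key is a hypothetical helper name, not a function from the opencage package.

```r
# Hypothetical helper (not part of the opencage package): returns TRUE when
# the OPENCAGE_KEY environment variable holds a non-empty value.
has_opencage_key <- function() {
  nzchar(Sys.getenv("OPENCAGE_KEY", unset = ""))
}

if (!has_opencage_key()) {
  message("OPENCAGE_KEY is not set. Add it to ~/.Renviron and restart R.")
}
```

If the message appears after you edited .Renviron, the most likely cause is that R has not been restarted yet.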
Geocode a single address or place name
The opencage package provides two forward geocoding functions:
- oc_forward_df - returns a tibble (data frame) directly, convenient for tabular workflows.
- oc_forward - returns a list, with support for different output formats: df_list, json_list, geojson_list, and url_only.
Use oc_forward_df when you want results directly as a data frame, for example to add columns to an existing tibble. Use oc_forward when you need the raw list structure or alternative output formats like GeoJSON.
oc_forward_df example:
library(opencage)
result <- oc_forward_df(placename = "Bordeaux, France")
print(result)
# A tibble: 1 × 4
# placename oc_lat oc_lng oc_formatted
# <chr> <dbl> <dbl> <chr>
# 1 Bordeaux, France 44.8 -0.580 Bordeaux, Gironde, France
List of all available data frame variables:
placename <chr>
oc_lat <dbl>
oc_lng <dbl>
oc_confidence <int>
oc_formatted <chr>
oc_northeast_lat <dbl>
oc_northeast_lng <dbl>
oc_southwest_lat <dbl>
oc_southwest_lng <dbl>
oc_iso_3166_1_alpha_2 <chr>
oc_iso_3166_1_alpha_3 <chr>
oc_category <chr>
oc_type <chr>
oc_normalized_city <chr>
oc_city <chr>
oc_city_district <chr>
oc_continent <chr>
oc_country <chr>
oc_country_code <chr>
oc_county <chr>
oc_county_code <chr>
oc_house_number <chr>
oc_postcode <chr>
oc_quarter <chr>
oc_road <chr>
oc_state <chr>
oc_state_code <chr>
oc_state_district <chr>
oc_suburb <chr>
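When picking columns from this list, it may be safer not to assume that every variable is present in every result. The sketch below assumes that some component columns can be absent for a given place. It uses a hand-built stand-in data frame with illustrative values rather than a real API response, so it runs without a key.

```r
# Stand-in for a geocoding result so the example runs without an API call.
# The values are illustrative, not actual API output.
result <- data.frame(
  placename = "Bordeaux, France",
  oc_lat = 44.8,
  oc_lng = -0.58,
  oc_country_code = "fr",
  stringsAsFactors = FALSE
)

# Columns we would like to keep; oc_postcode is deliberately absent above.
wanted <- c("placename", "oc_lat", "oc_lng", "oc_postcode")

# Keep only the columns that exist, and note the ones that do not.
present <- intersect(wanted, names(result))
missing_cols <- setdiff(wanted, names(result))
subset_result <- result[, present, drop = FALSE]

if (length(missing_cols) > 0) {
  message("Not in this result: ", paste(missing_cols, collapse = ", "))
}
```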
oc_forward example:
result <- oc_forward(placename = "Bordeaux, France")
result
# [[1]]
# # A tibble: 1 × 68
# confidence formatted ...
# Access the coordinates from the list
result[[1]]$oc_lat
result[[1]]$oc_lng
Convert coordinates to an address (reverse geocoding)
library(opencage)
result <- oc_reverse_df(latitude = 51.5034070, longitude = -0.1275920)
print(result)
# A tibble: 1 × 3
# latitude longitude oc_formatted
# <dbl> <dbl> <chr>
# 1 51.5 -0.128 10 Downing Street, Westminster, London, SW1A 2AA, United Kingdom
# Access individual components, see the forward example for a list of variable names
result$oc_postcode
# [1] "SW1A 2AA"
result$oc_city
# [1] "London"
With oc_reverse the result is a list instead of a data frame:
result <- oc_reverse(latitude = 51.5034070, longitude = -0.1275920)
result[[1]]$oc_formatted
# [1] "10 Downing Street, Westminster, London, SW1A 2AA, United Kingdom"
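Reverse geocoding only makes sense for coordinates inside the valid ranges (latitude from -90 to 90, longitude from -180 to 180), so a cheap check before calling the API can avoid wasted requests. The sketch below is plain base R; valid_coords is a hypothetical helper, not part of the opencage package.

```r
# Hypothetical helper: TRUE for each coordinate pair that is non-missing and
# within the valid latitude/longitude ranges. Works on vectors.
valid_coords <- function(latitude, longitude) {
  !is.na(latitude) & !is.na(longitude) &
    latitude >= -90 & latitude <= 90 &
    longitude >= -180 & longitude <= 180
}

# The Downing Street coordinates from the example above pass the check.
valid_coords(51.5034070, -0.1275920)

# A latitude of 120 and a missing latitude are both rejected.
valid_coords(c(51.5, 120, NA), c(-0.13, 10, 0))
```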
Geocode a list of places
This step uses the city list from our example address and coordinates lists.
Using dplyr and writing the results to a new file
library(opencage)
library(readr)
library(dplyr)
# Limit to 20 cities for testing
cities <- read_csv("cities.csv", n_max = 20)
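results <- cities %>%
  rowwise() %>%
  # oc_forward() returns a list of tibbles; keep the first one
  mutate(geo = list(oc_forward(city)[[1]])) %>%
  # Take the first match; result columns are prefixed with oc_
  mutate(lat = geo$oc_lat[1],
         lng = geo$oc_lng[1]) %>%
  ungroup() %>%
  select(-geo)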
write_csv(results, "cities_geocoded.csv")
print(results)
Without dplyr, filling the same data frame
library(opencage)
library(readr)
# Read CSV
cities <- read_csv("cities.csv")
# Prepare result columns
lat <- numeric(nrow(cities))
lng <- numeric(nrow(cities))
# Loop through each city
for (i in seq_len(nrow(cities))) {
  city_name <- cities$city[i]
  # Print progress to stderr
  message(sprintf("[%d/%d] %s", i, nrow(cities), city_name))
  # oc_forward() returns a list of tibbles; take the first one
  result <- oc_forward(city_name)[[1]]
  if (nrow(result) > 0) {
    # Use the first match; result columns are prefixed with oc_
    lat[i] <- result$oc_lat[1]
    lng[i] <- result$oc_lng[1]
  } else {
    lat[i] <- NA
    lng[i] <- NA
  }
}
# Add results to the data frame
cities$lat <- lat
cities$lng <- lng
print(cities)
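If the input list contains repeated place names, geocoding each distinct name once and mapping the results back can reduce the number of requests. The sketch below illustrates that idea in base R. geocode_unique and the geocode_fn argument are hypothetical names, and the fake geocoder returns made-up numbers so the example runs without an API key.

```r
# Hypothetical helper: geocode each distinct place once, then map the
# results back onto the original (possibly repeated) input vector.
# geocode_fn is any function that takes one place name and returns a
# named numeric vector c(lat = ..., lng = ...).
geocode_unique <- function(places, geocode_fn) {
  unique_places <- unique(places)
  looked_up <- lapply(unique_places, geocode_fn)
  lat <- vapply(looked_up, function(x) x[["lat"]], numeric(1))
  lng <- vapply(looked_up, function(x) x[["lng"]], numeric(1))
  # match() gives, for every input row, the position of its unique entry
  idx <- match(places, unique_places)
  data.frame(place = places, lat = lat[idx], lng = lng[idx],
             stringsAsFactors = FALSE)
}

# Fake geocoder for demonstration only: counts its calls and returns
# made-up coordinates derived from the name length.
counter <- new.env()
counter$n <- 0
fake_geocoder <- function(place) {
  counter$n <- counter$n + 1
  c(lat = as.numeric(nchar(place)), lng = -as.numeric(nchar(place)))
}

out <- geocode_unique(c("Paris", "Lyon", "Paris"), fake_geocoder)
message("Lookups performed: ", counter$n)
```

With the real package, geocode_fn could wrap oc_forward, for example by taking the first tibble of the returned list and reading its oc_lat and oc_lng columns.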
Geocode a list of coordinates
Reverse geocode a list of coordinates from our example address and coordinates lists.
library(opencage)
library(readr)
# Read CSV with coordinates
coords <- read_csv("coordinates100_fr.csv", n_max = 20)
# Prepare result column
address <- character(nrow(coords))
# Loop through each coordinate pair
for (i in seq_len(nrow(coords))) {
  # Print progress to stderr
  message(sprintf("[%d/%d] %.4f, %.4f", i, nrow(coords),
                  coords$latitude[i], coords$longitude[i]))
  # oc_reverse() returns a list of tibbles; take the first one
  result <- oc_reverse(coords$latitude[i], coords$longitude[i])[[1]]
  print(result)
  if (nrow(result) > 0) {
    address[i] <- result$oc_formatted[1]
  } else {
    address[i] <- NA
  }
}
# Add results to the data frame
coords$address <- address
print(coords)
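In a long loop, a single failed request would otherwise stop the whole run. One possible approach is to wrap each call in tryCatch and record NA for the rows that fail. The sketch below shows the pattern with a made-up flaky_lookup function standing in for the geocoding call; safe_call is a hypothetical helper name.

```r
# Hypothetical helper: call fn(...) and return `default` instead of
# stopping when the call raises an error.
safe_call <- function(fn, ..., default = NA_character_) {
  tryCatch(
    fn(...),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      default
    }
  )
}

# Made-up stand-in for a geocoding call that fails for one input.
flaky_lookup <- function(x) {
  if (x == "bad") stop("simulated failure")
  toupper(x)
}

inputs <- c("a", "bad", "c")
outputs <- vapply(inputs, function(x) safe_call(flaky_lookup, x),
                  character(1), USE.NAMES = FALSE)
print(outputs)
```

The failed row ends up as NA while the remaining rows are still processed, which makes it easy to retry only the missing ones later.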
Troubleshooting / Testing API connection
If you are having trouble connecting to the OpenCage API, you can use R's built-in URL fetching to test if the API is reachable. If this returns an error, check your internet connection and firewall settings.
response <- readLines("https://api.opencagedata.com/ping")
print(response)
# Should return: "pong"
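Because readLines stops with an error when the URL cannot be reached, a script may prefer a TRUE/FALSE answer instead. The following base R sketch wraps the call in tryCatch; can_fetch is a hypothetical helper name. The demonstration uses a temporary file so it runs offline, and the ping URL appears only in a comment.

```r
# Hypothetical helper: TRUE if `url` (a URL or file path) can be read,
# FALSE if reading it raises an error.
can_fetch <- function(url) {
  tryCatch(
    {
      suppressWarnings(readLines(url, warn = FALSE))
      TRUE
    },
    error = function(e) FALSE
  )
}

# Offline demonstration with a temporary file and a path that does not exist.
tmp <- tempfile(fileext = ".txt")
writeLines("pong", tmp)
can_fetch(tmp)
can_fetch(file.path(tempdir(), "does-not-exist.txt"))

# Against the API this would be:
# can_fetch("https://api.opencagedata.com/ping")
```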
You can also use oc_forward with return = "url_only" to see the exact API request URL being sent:
oc_forward(placename = "Bordeaux, France", return = "url_only")
# "https://api.opencagedata.com/geocode/v1/json?q=Bordeaux%2C+France&key=..."
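A request URL can include the key parameter, so it is worth masking it before pasting the URL into a bug report or forum post. The sketch below does this with a regular expression in base R. redact_key is a hypothetical helper, and the example URL uses the YOUR-API-KEY placeholder rather than a real key.

```r
# Hypothetical helper: replace the value of the `key` query parameter
# with the literal text REDACTED.
redact_key <- function(url) {
  sub("([?&]key=)[^&]*", "\\1REDACTED", url)
}

# Example URL with a placeholder key (not a real request).
example_url <- paste0(
  "https://api.opencagedata.com/geocode/v1/json",
  "?q=Bordeaux%2C+France&key=YOUR-API-KEY"
)
redact_key(example_url)
```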
Alternatives for large datasets
R processes geocoding requests sequentially in a single thread. For large datasets with millions of addresses, this approach may be slow. Consider these alternatives:
- Use our command line interface (CLI) which supports batch processing with multiple workers.
- For very large datasets, consider using Python with async processing capabilities.
Before you start geocoding at high volume, please read our guide to geocoding large datasets where we explain various strategies and points to consider.
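If you stay in R for a moderately large job, one simple option is to process the input in chunks, so that each finished chunk can be written to disk before the next one starts. The sketch below shows only the splitting step in base R; split_into_chunks is a hypothetical helper and the chunk size is an arbitrary choice.

```r
# Hypothetical helper: split a vector into consecutive chunks of at most
# `chunk_size` elements, returned as an unnamed list.
split_into_chunks <- function(x, chunk_size) {
  stopifnot(chunk_size >= 1)
  unname(split(x, ceiling(seq_along(x) / chunk_size)))
}

places <- c("Paris", "Lyon", "Nice", "Lille", "Brest")
chunks <- split_into_chunks(places, chunk_size = 2)

for (i in seq_along(chunks)) {
  # A real run would geocode chunks[[i]] here and save the result to a file.
  message(sprintf("Chunk %d: %s", i, paste(chunks[[i]], collapse = ", ")))
}
```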
Alternative R packages
- tidygeocoder - a unified high-level interface for geocoding