
# R Geocoding Tutorial

## OpenCage Geocoding API R Tutorial

This tutorial shows how to use the [OpenCage geocoding API](https://opencagedata.com/api), an API for forward and reverse geocoding based on open geo data, from R.

### Topics covered in this tutorial

- General Background
- Installing the opencage R package
- Forward geocoding
- Reverse geocoding
- Geocoding a list of places
- Geocoding a list of coordinates
- Troubleshooting / Testing API connection
- Alternatives for large datasets
- Further reading

### Background

The code examples below will use your geocoding API key once you [log in](https://opencagedata.com/users/sign_in).

#### Before we dive into the tutorial

1. [Sign up](https://opencagedata.com/users/sign_up) for an OpenCage geocoding API key.
2. Play with the [demo page](https://opencagedata.com/demo), so that you see the actual response the API returns.
3. Browse the [API reference](https://opencagedata.com/api), so you understand the [optional parameters](https://opencagedata.com/api#optional-params), [best practices](https://opencagedata.com/api#bestpractices), [possible response codes](https://opencagedata.com/api#codes), and the [rate limiting](https://opencagedata.com/api#rate-limiting) on free trial accounts.

### Install the opencage R package

[![CRAN version badge](https://www.r-pkg.org/badges/version/opencage)](https://CRAN.R-project.org/package=opencage)

The [opencage R package](https://docs.ropensci.org/opencage/) is developed and maintained by [rOpenSci](https://ropensci.org/) and is available on [CRAN](https://cran.r-project.org/ "The Comprehensive R Archive Network").

#### Install stable version from CRAN

    install.packages("opencage")

#### Install development version from R-universe

    install.packages("opencage", repos = "https://ropensci.r-universe.dev")

#### Install development version from GitHub

    # install.packages("pak")
    pak::pak("ropensci/opencage")

See the [rOpenSci documentation](https://docs.ropensci.org/opencage/) and the [source code on GitHub](https://github.com/ropensci/opencage).

### Set up your API key

Store your API key as an environment variable. The recommended approach is to add it to your `.Renviron` file (located in your home directory). Open it for editing:

    file.edit("~/.Renviron")

Add this line to the file and save:

    OPENCAGE_KEY=YOUR-API-KEY

Restart R for the changes to take effect.

Alternatively, set the key for your current session only:

    Sys.setenv(OPENCAGE_KEY = "YOUR-API-KEY")
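
Before making requests, it can help to confirm that R actually sees the key. The following is a minimal sketch: `has_opencage_key()` is a hypothetical helper name, not part of the opencage package, and it only checks that the variable is non-empty, not that the key is valid.

```r
# Hypothetical helper (not part of opencage): TRUE if the
# OPENCAGE_KEY environment variable is set to a non-empty value.
has_opencage_key <- function() {
  nzchar(Sys.getenv("OPENCAGE_KEY"))
}

if (!has_opencage_key()) {
  message("OPENCAGE_KEY is not set; add it to ~/.Renviron or use Sys.setenv().")
}
```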

### Geocode a single address or place name

The opencage package provides two forward geocoding functions:

- `oc_forward_df` - returns a **tibble (data frame)** directly, convenient for tabular workflows.
- `oc_forward` - returns a **list** with support for different output formats: `df_list`, `json_list`, `geojson_list`, and `url_only`.

Use `oc_forward_df` when you want results directly as a data frame, for example to add columns to an existing tibble. Use `oc_forward` when you need the raw list structure or alternative output formats like GeoJSON.

`oc_forward_df` example:

    library(opencage)
    
    result <- oc_forward_df(placename = "Bordeaux, France")
    print(result)
    # A tibble: 1 × 4
    # placename oc_lat oc_lng oc_formatted
    # <chr> <dbl> <dbl> <chr>
    # 1 Bordeaux, France 44.8 -0.580 Bordeaux, Gironde, France

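By default only the columns shown above are returned. With `output = "all"`, the data frame can contain the following variables, depending on the result: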

    placename <chr>
    oc_lat <dbl>
    oc_lng <dbl>
    oc_confidence <int>
    oc_formatted <chr>
    oc_northeast_lat <dbl>
    oc_northeast_lng <dbl>
    oc_southwest_lat <dbl>
    oc_southwest_lng <dbl>
    oc_iso_3166_1_alpha_2 <chr>
    oc_iso_3166_1_alpha_3 <chr>
    oc_category <chr>
    oc_type <chr>
    oc_normalized_city <chr>
    oc_city <chr>
    oc_city_district <chr>
    oc_continent <chr>
    oc_country <chr>
    oc_country_code <chr>
    oc_county <chr>
    oc_county_code <chr>
    oc_house_number <chr>
    oc_postcode <chr>
    oc_quarter <chr>
    oc_road <chr>
    oc_state <chr>
    oc_state_code <chr>
    oc_state_district <chr>
    oc_suburb <chr>

`oc_forward` example:

    result <- oc_forward(placename = "Bordeaux, France")
    result
    # [[1]]
    # # A tibble: 1 × 68
    #   oc_confidence oc_formatted ...
    
    # Access the coordinates from the list
    result[[1]]$oc_lat
    result[[1]]$oc_lng

### Convert coordinates to an address (reverse geocoding)

    library(opencage)
    
    result <- oc_reverse_df(latitude = 51.5034070, longitude = -0.1275920)
    print(result)
    # A tibble: 1 × 3
    # latitude longitude oc_formatted
    # <dbl> <dbl> <chr>
    # 1 51.5 -0.128 10 Downing Street, Westminster, London, SW1A 2AA, United Kingdom
    
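    # Request all columns to access individual components;
    # see the forward example for a list of variable names
    result <- oc_reverse_df(latitude = 51.5034070, longitude = -0.1275920,
                            output = "all")
    result$oc_postcode
    # [1] "SW1A 2AA"
    result$oc_city
    # [1] "London"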

With `oc_reverse` the result is a list instead of a data frame:

    result <- oc_reverse(latitude = 51.5034070, longitude = -0.1275920)
    result[[1]]$oc_formatted
    # [1] "10 Downing Street, Westminster, London, SW1A 2AA, United Kingdom"

### Geocode a list of places

In this step we use a city list from our [example address and coordinates](https://opencagedata.com/tools/address-lists) lists.

#### Using dplyr and writing results to a new file

    library(opencage)
    library(readr)
    library(dplyr)
    
    # Limit to 20 cities for testing
    cities <- read_csv("cities.csv", n_max = 20)
    
    # oc_forward_df() geocodes the whole `city` column and appends
    # oc_lat, oc_lng and oc_formatted to the data frame
    results <- cities %>%
      oc_forward_df(placename = city) %>%
      rename(lat = oc_lat, lng = oc_lng) %>%
      select(-oc_formatted)
    
    write_csv(results, "cities_geocoded.csv")
    
    print(results)

#### Without dplyr and filling the same data frame

    library(opencage)
    library(readr)
    
    # Read CSV
    cities <- read_csv("cities.csv")
    
    # Prepare result columns
    lat <- numeric(nrow(cities))
    lng <- numeric(nrow(cities))
    
    # Loop through each city
    for (i in seq_len(nrow(cities))) {
      city_name <- cities$city[i]
    
      # Print progress to stderr
      message(sprintf("[%d/%d] %s", i, nrow(cities), city_name))
    
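      # oc_forward() returns a list with one tibble per query
      result <- oc_forward(city_name, limit = 1)[[1]]
    
      # Use the first result if there is one, otherwise record NA
      if (nrow(result) > 0 && "oc_lat" %in% names(result)) {
        lat[i] <- result$oc_lat[1]
        lng[i] <- result$oc_lng[1]
      } else {
        lat[i] <- NA
        lng[i] <- NA
      }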
    }
    
    # Add results to the data frame
    cities$lat <- lat
    cities$lng <- lng
    
    print(cities)
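
A single failed request (for example a network error) would stop the loop above partway through the file. Here is a minimal sketch of one way to guard against that, assuming you are happy to record `NA` for failed rows. `geocode_safely()` and `fake_geocoder()` are hypothetical names used for illustration; `geocode_fn` stands for whatever function performs the real lookup, such as a small wrapper around `oc_forward`.

```r
# Hypothetical helper: call geocode_fn(place), which is expected to return
# a named numeric vector c(lat = ..., lng = ...). On error, emit a warning
# and return NA for both values instead of stopping.
geocode_safely <- function(place, geocode_fn) {
  tryCatch(
    geocode_fn(place),
    error = function(e) {
      warning(sprintf("Geocoding '%s' failed: %s", place, conditionMessage(e)),
              call. = FALSE)
      c(lat = NA_real_, lng = NA_real_)
    }
  )
}

# Stand-in geocoder for illustration only; replace with a real lookup.
fake_geocoder <- function(place) {
  if (place == "") stop("empty query")
  c(lat = 44.8, lng = -0.58)
}

ok  <- geocode_safely("Bordeaux, France", fake_geocoder)
bad <- suppressWarnings(geocode_safely("", fake_geocoder))
```

Passing the lookup function as an argument keeps the error handling separate from the API call, so the same wrapper can be reused for reverse geocoding.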

### Geocode a list of coordinates

Reverse geocode a list of coordinates from our [example address and coordinates](https://opencagedata.com/tools/address-lists) lists.

    library(opencage)
    library(readr)
    
    # Read CSV with coordinates
    coords <- read_csv("coordinates100_fr.csv", n_max = 20)
    
    # Prepare result column
    address <- character(nrow(coords))
    
    # Loop through each coordinate pair
    for (i in seq_len(nrow(coords))) {
      # Print progress to stderr
      message(sprintf("[%d/%d] %.4f, %.4f", i, nrow(coords),
                      coords$latitude[i], coords$longitude[i]))
    
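      # oc_reverse() returns a list with one tibble per coordinate pair
      result <- oc_reverse(coords$latitude[i], coords$longitude[i])[[1]]
      print(result)
    
      # Use the first result if there is one, otherwise record NA
      if (nrow(result) > 0 && "oc_formatted" %in% names(result)) {
        address[i] <- result$oc_formatted[1]
      } else {
        address[i] <- NA
      }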
    }
    
    # Add results to the data frame
    coords$address <- address
    
    print(coords)
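
Input files sometimes contain missing or out-of-range coordinates, and sending those to the API wastes requests. The sketch below shows one way to flag such rows before the loop; `valid_coords()` is a hypothetical helper, and the `latitude` and `longitude` column names are assumed to match the CSV used above.

```r
# Hypothetical helper: TRUE where a latitude/longitude pair is
# non-missing and within the valid ranges (-90..90 and -180..180).
valid_coords <- function(latitude, longitude) {
  !is.na(latitude) & !is.na(longitude) &
    latitude >= -90 & latitude <= 90 &
    longitude >= -180 & longitude <= 180
}

# Made-up example rows: one valid, one out of range, one missing
example_coords <- data.frame(
  latitude  = c(51.5034, 95, NA),
  longitude = c(-0.1276, 2.35, 2.35)
)
example_coords$valid <- valid_coords(example_coords$latitude,
                                     example_coords$longitude)
```

Rows where `valid` is `FALSE` can then be skipped or reported instead of being reverse geocoded.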

### Troubleshooting / Testing API connection

If you are having trouble connecting to the OpenCage API, you can use R's built-in URL fetching to test if the API is reachable. If this returns an error, check your internet connection and firewall settings.

    response <- readLines("https://api.opencagedata.com/ping")
    print(response)
    # Should return: "pong"

You can also use `oc_forward` with `return = "url_only"` to see the exact API request URL being sent:

    oc_forward(placename = "Bordeaux, France", return = "url_only")
    # "https://api.opencagedata.com/geocode/v1/json?q=Bordeaux%2C+France&key=..."

### Alternatives for large datasets

R processes geocoding requests sequentially in a single thread. For large datasets with millions of addresses, this approach may be slow. Consider these alternatives:

- Use our [command line interface (CLI)](https://opencagedata.com/tutorials/geocode-commandline) which supports batch processing with multiple workers.
- For very large datasets, consider using [Python](https://opencagedata.com/tutorials/geocode-in-python) with async processing capabilities.

Before you start geocoding at high volume, please read our [guide to geocoding large datasets](https://opencagedata.com/guides/how-to-geocode-large-datasets) where we explain various strategies and points to consider.
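
If you do stay in R for a moderately large file, one option is to work through the rows in chunks so that partial results can be written to disk along the way. This is only a sketch of the chunking step: `split_into_chunks()` is a hypothetical helper, and the chunk size is an arbitrary example value.

```r
# Hypothetical helper: split row indices 1..n into consecutive
# chunks of at most chunk_size rows, returned as a list of integer vectors.
split_into_chunks <- function(n, chunk_size) {
  idx <- seq_len(n)
  unname(split(idx, ceiling(idx / chunk_size)))
}

chunks <- split_into_chunks(n = 10, chunk_size = 4)
# chunks holds rows 1-4, rows 5-8 and rows 9-10
```

Each element of `chunks` can be used to subset the data frame, geocode that subset, and save the result before moving on to the next chunk.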

### Alternative R packages

- [tidygeocoder](https://jessecambon.github.io/tidygeocoder/index.html) - a unified high-level interface for geocoding

### Further reading

- [OpenCage geocoding API Reference](https://opencagedata.com/api)
- [Comparing geocoding services](https://opencagedata.com/guides/how-to-compare-and-test-geocoding-services)
- [Cleaning / formatting your forward geocoding query](https://opencagedata.com/guides/how-to-format-your-geocoding-query)
- [Geocoding more quickly](https://opencagedata.com/guides/how-to-geocode-more-quickly)
- [Geocoding large datasets](https://opencagedata.com/guides/how-to-geocode-large-datasets)
- [Geocoding and preserving privacy](https://opencagedata.com/guides/how-to-preserve-privacy-by-showing-only-an-imprecise-location)
- [Sample address and coordinate lists for testing](https://opencagedata.com/tools/address-lists)

