Formatting Addresses

The problem

Address formats vary from country to country. Digital services originally developed for one market often make assumptions about how addresses look and then apply them to another market, a subtle internationalization (i18n) error. This can confuse or alienate users.

The "Berlin, Berlin" Example

In the United States it is common to display addresses in the format city, state abbreviation, for example Denver, CO, USA for the city of Denver in the state of Colorado.

When American services launch in Germany they often try to follow this pattern, always showing city, state.

The problem, though, is that in Germany we have a few major cities, like Berlin, that are also states. Here's a screenshot from LinkedIn where they refer to a user's location as Berlin, Berlin, Germany:

Berlin, Berlin example from LinkedIn

This is not technically incorrect: Berlin is indeed both a city and a state. But it does not match what users expect. It reads oddly, and it immediately shows a German consumer that this service was not really built with Germany in mind.
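As a rough illustration of the display problem, here is a minimal Python sketch that skips the state when it repeats the city name. The function name display_location is hypothetical, and the field names city, state and country are assumed to follow the components shown later in this guide. It is a naive workaround for one case, not how our formatting works.

```python
def display_location(components):
    """Join city, state and country, skipping any value already shown."""
    parts = []
    for key in ("city", "state", "country"):
        value = components.get(key)
        # Drop empty fields and repeats such as "Berlin, Berlin"
        if value and value not in parts:
            parts.append(value)
    return ", ".join(parts)


print(display_location({"city": "Berlin", "state": "Berlin", "country": "Germany"}))
print(display_location({"city": "Denver", "state": "Colorado", "country": "United States"}))
```

A check like this only handles city-states. It does nothing about the order of the parts, which is the larger problem.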

How we solve the address formatting problem

In each geocoding result you will find a section of information called the components. Here, for example, are the components we return for a request to reverse geocode the coordinates 52.3877830, 9.7334394 (the coordinates of the OpenCage office in Hanover, Germany).

"components" : {
    "ISO_3166-1_alpha-2" : "DE",
    "ISO_3166-1_alpha-3" : "DEU",
    "ISO_3166-2" : [
       "DE-NI"
    ],
    "_category" : "building",
    "_normalized_city" : "Hanover",
    "_type" : "building",
    "city" : "Hanover",
    "city_district" : "Vahrenwald-List",
    "continent" : "Europe",
    "country" : "Germany",
    "country_code" : "de",
    "county" : "Region Hannover",
    "house_number" : "2",
    "office" : "Design Offices",
    "political_union" : "European Union",
    "postcode" : "30165",
    "road" : "Philipsbornstra\u00dfe",
    "state" : "Lower Saxony",
    "state_code" : "NI",
    "suburb" : "Vahrenwald"
},

This is all valid information about the location. But it can be overwhelming. Which of those pieces should a developer use to show a user the address of the location?

Luckily there is no need to guess. We take care of that and provide a formatted string that uses the relevant pieces of the components and presents them in the correct order for that geography:

"formatted": "Design Offices, Philipsbornstraße 2, 30165 Hanover, Germany",

Note that by default the name of the _type (in this case "Design Offices") is included in the formatted value. This can be turned off via the optional address_only parameter, in which case the formatted portion of the response is simply:

"formatted": "Philipsbornstraße 2, 30165 Hanover, Germany",
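In code, using the formatted string means reading one field instead of assembling the components yourself. The following Python sketch parses a trimmed sample response body. The top-level results list and the helper name first_formatted are assumptions made for illustration, so check the API documentation for the exact response shape.

```python
import json

# Trimmed sample body; the top-level "results" list is an assumed response shape
SAMPLE_BODY = """
{
  "results": [
    {
      "components": {"city": "Hanover", "country": "Germany", "postcode": "30165"},
      "formatted": "Design Offices, Philipsbornstraße 2, 30165 Hanover, Germany"
    }
  ]
}
"""


def first_formatted(body):
    """Return the formatted string of the first result, or None if there is none."""
    results = json.loads(body).get("results", [])
    return results[0]["formatted"] if results else None


print(first_formatted(SAMPLE_BODY))
```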

More examples

You might think this is a relatively minor problem, but it becomes much more complex when you are building a service that manages addresses in multiple countries. Have a look at these addresses:

'au': '223 William Street, Melbourne VIC 3000, Australia'
'de': 'Rosenthaler Straße 1, 10119 Berlin, Germany'
'es': 'Carrer de Calatrava, 68, 08017 Barcelona, Spain'
'gb': '115 New Cavendish Street, London W1T 5DU, United Kingdom'
'it': 'Via Canosa 92, 76121 Barletta BT, Italy'
'za': '3 Upper Alma Road, Rosebank, Cape Town, 7700, South Africa'

In six different countries we have six different formats! Painful.
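To make the differences concrete, here is a simplified Python sketch that formats components with one template per country code. The TEMPLATES table and the format_address function are hypothetical stand-ins written for this illustration. They cover only three of the countries above and have no fallbacks for missing fields.

```python
# Hypothetical per-country templates, keyed by lowercase country code
TEMPLATES = {
    "au": "{house_number} {road}, {city} {state_code} {postcode}, {country}",
    "de": "{road} {house_number}, {postcode} {city}, {country}",
    "gb": "{house_number} {road}, {city} {postcode}, {country}",
}


def format_address(components):
    """Fill the template for the component's country; unused fields are ignored."""
    template = TEMPLATES[components["country_code"]]
    return template.format(**components)


berlin = {
    "country_code": "de",
    "road": "Rosenthaler Straße",
    "house_number": "1",
    "postcode": "10119",
    "city": "Berlin",
    "state": "Berlin",
    "country": "Germany",
}
print(format_address(berlin))
```

Even this toy version shows why hard-coding a single order of fields fails as soon as a second country is added.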

Shorter please

There may be times when space for displaying a location is limited. We provide an optional parameter, abbrv. When it is set, we attempt to abbreviate the formatted string. For example, "United States of America" becomes "USA". The templates we use for the abbreviations can also be found in the address-formatting repository on GitHub.

For more details see the API documentation of the abbrv parameter.
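As a sketch of how the optional parameters might be passed, the Python snippet below builds a reverse geocoding request URL without sending it. The endpoint URL, the q and key parameter names, and the value 1 for switching on abbrv and address_only are assumptions here, and build_request_url is a hypothetical helper. Please confirm the details against the API documentation.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the API documentation
BASE_URL = "https://api.opencagedata.com/geocode/v1/json"


def build_request_url(lat, lng, api_key, abbrv=False, address_only=False):
    """Build a reverse geocoding URL with the optional formatting flags."""
    params = {"q": f"{lat},{lng}", "key": api_key}
    if abbrv:
        params["abbrv"] = 1  # assumed: 1 asks for an abbreviated formatted string
    if address_only:
        params["address_only"] = 1  # assumed: 1 leaves the place name out
    return f"{BASE_URL}?{urlencode(params)}"


print(build_request_url(52.387783, 9.7334394, "YOUR-API-KEY", abbrv=True))
```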

Our open source address templating project

The rules we use to format the data available to us in the components are all open sourced in our address-formatting project on GitHub, along with hundreds of tests. The templates are independent of any programming language. We maintain Geo::Address::Formatter, a parser in Perl, and parsers have also been written in many other languages. The templates are actively maintained and we welcome all contributions, especially tests for edge cases where we can improve.

Further Resources

Happy geocoding!
