If Bing Maps and Google can find the Blarney Stone, why geocode with anything else?

Submit Ireland’s famous “Blarney Stone” to Bing Maps’ and Google’s geocoding APIs and both will return its map coordinates, rather precisely too. Bing Maps geocodes its location to Latitude 51.929092, Longitude -8.570564; and Google to Latitude 51.929019, Longitude -8.570338. That’s a difference of about 17 meters. Neither is accurate enough to kiss it, but both do at least get you to the front door. Try that with a traditional enterprise geocoder and you’ll get nowhere.
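For readers who want to check the gap between the two results themselves, here is a minimal sketch using the haversine great-circle formula. The function name `haversine_m` and the spherical Earth radius are my own assumptions for illustration; neither API exposes such a function.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Coordinates quoted in the text, as (latitude, longitude)
bing = (51.929092, -8.570564)
google = (51.929019, -8.570338)

gap_m = haversine_m(*bing, *google)
print(f"{gap_m:.1f} m apart")
```

A spherical model is more than adequate at this scale; an ellipsoidal calculation would change the answer by centimeters at most.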

Blarney Castle

In fact, the Bing Maps and Google geocoding APIs are really good at returning a map coordinate for a surprisingly varied set of locations, regardless of data quality, and regardless of whether the input is a street address, a place, a geographic feature, or a point of interest. Take for example the following list of locations. If you are familiar with enterprise geocoders, you will recognize immediately which cases will fail in them. Yet the Bing Maps and Google APIs succeed.

Geocoding Results

Without thinking much about it, many of us refer to the APIs simply as geocoders. For many, the term geocoder is shorthand for address geocoder. Typically in an organization, your goal is to spatially enable a collection of addresses by assigning map coordinates to them. That’s a primary reason why we process billions of addresses in databases through enterprise address geocoders.

However, the Bing Maps and Google methods are much more than address geocoders. The goal with them is to spatially enable web queries of locations. They assume unstructured and somewhat unstandardized location data, unlike what we would commonly find in corporate databases. Witness the examples above, such as the subway station. Without knowing the address, both systems pinpoint it exactly. So I think it’s more useful to describe the APIs as gazetteers. Besides simply being a much more expansive characterization, this helps to explain the things the APIs do not do, as well as some of their seemingly odd and even irrational behavior.

Does the versatility of Bing Maps’ and Google’s geocoding APIs make them candidates for replacing traditional enterprise address geocoders? Maybe, maybe not. From a coding perspective, it’s certainly easy enough to slap calls to the APIs into your code (provided you’ve acquired the necessary licenses). Before you jump to it, we think you have to evaluate why you geocode in the first place, and how the APIs fit into your objectives. For the latter, we think you need to come at it from two directions. First, from the perspective of what the APIs do not do; and then from the perspective of why the APIs exist to do what they do.

What They Do Not Do

For USA addresses, neither the Bing Maps nor the Google API appends Census Codes (e.g., Census Block Codes), County identifiers, or ZIP+4s. The same is true for the equivalent codes in other countries. Often in a corporate environment, these kinds of map keys are critical. The APIs also do not assign Postal Carrier Routes, and do not do Delivery Point Validation, CASS certification, 911 address conversions (e.g., LACS Link) or suite corrections. Nor do they parse addresses into components. If you are used to the big enterprise geocoders, then you likely take all these features for granted.

While both methods return “formatted” addresses, the output is not really standardized to meet Postal rules. Here are some sample outputs from the US Postal System and both APIs. The inputs, not shown here, were considerably less clean. Bing Maps and Google both neatly format the address in the output, but more from a visual perspective than a Postal one. At least in this limited sample, Bing Maps’ output is closer to a Postal standard. On the other hand, neither API appends the ZIP+4. Again, though, neither tool is meant to provide CASS certification (in the US).

Address Formatting

The APIs also don’t really do address correction, in the sense of returning corrections to bad input. Their goal is to geocode your data, not correct it. Consider the case of 526 E 20TH ST, NYC, NY, 10010, which is one of the examples above. The correct ZIP Code is 10009, per USPS. Bing Maps corrects it to 10009 in the output; Google leaves it as is, while still geocoding the address correctly. But then, it’s not Google’s purpose to duplicate USPS. Similarly, neither method explicitly tells you whether an address is valid, unlike the enterprise geocoders.

What They Do (and very well)

The main reason the Bing Maps and Google geocoding APIs exist is to provide location awareness to search results, mainly to help support advertising. Applications like geo-locating for mapping and routing purposes are important, but ultimately secondary from a money-making perspective. That’s why they go to such great lengths to provide a map coordinate of any kind. Submit the address “No.2 Jalan PJU 7/2, Mutiara Damansara 47800 Petaling Jaya Selangor Darul Ehsan” in Malaysia (it’s an IKEA store) to Google’s API and it will return a latitude of 4.210484 and longitude of 101.975766 (with that many decimal places), characterizing it as an “approximate” location. Yet that point is simply the center of the country, and as it happens located in a forest.

If the APIs are fearless at returning map coordinates for almost anything you throw at them, it’s imperative in your own applications that you judge the quality of the geocodes. That’s especially true if your application is oriented towards address geocoding, where you need reasonably reliable map coordinates for specific addresses. Both APIs provide useful gauges of both accuracy and precision. Accuracy indicators help figure out whether the map coordinates are supposed to represent the address itself, or some higher level “fallback” position. In the extreme case, the fallback position could be the centroid of the country, as we found with the example of the IKEA store in Malaysia. The precision indicators, for their part, attempt to describe how close the map coordinates are to the actual address.

With traditional enterprise geocoders, accuracy and precision often are captured in highly defined “match codes.” With the APIs, the definitions are not only less specific, but the precision indicators also have no real meaning except in the context of the accuracy indicators. Plus, the precision indicators have absolutely nothing to do with the number of significant digits. This is why you get odd behaviors like six or more decimal digits in the latitude/longitude values when the precision is described as “approximate” and the accuracy as the center of a “country.” It’s also why an accuracy level of a “stream” together with a precision of “rooftop” has no useful meaning even though it’s easy to output that combination.

With the Bing Maps API, the accuracy indicators are returned through the Entity Type output element. With Google’s API, the equivalent comes through the Address Type output element. Naturally, the two collections are not the same. Bing Maps in fact uses at least 187 values. Yes, that’s right: at least 187 values. Google on the other hand documents 8 values. I emphasize “documents” because it also returns undocumented, yet still very useful, values. That’s why, for example, Google returns a subway station value even though it does not appear in the documentation. In fact, both vendors admit the existence of undocumented values.

Here’s a short sample of the accuracy values. I’ll leave the complete lists of possible values to the Bing Maps and Google documentation.

Bing Accuracy Types

For a measure of precision, Bing Maps gives you a data element called Calculation Method. With Google, the same kind of metric is returned via the Location Type. At least with the precision metric, the possible values are reasonably similar. 

Location Types
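To make the two pairs of indicators concrete, here is a rough sketch of pulling them out of each vendor’s JSON response. The field paths (`types` and `geometry.location_type` for Google; `entityType` and `geocodePoints[].calculationMethod` for Bing Maps) follow the response layouts as I understand them, so check them against the current documentation. The sample responses are hand-written and trimmed for illustration, with the Google one modeled on the Malaysian example; they are not captured API output.

```python
def google_indicators(response):
    """Return (accuracy, precision) for the first result of a Google geocoding response."""
    first = response["results"][0]
    return first["types"], first["geometry"]["location_type"]

def bing_indicators(response):
    """Return (accuracy, precision) for the first resource of a Bing Maps Locations response."""
    first = response["resourceSets"][0]["resources"][0]
    # calculationMethod is reported per geocode point; this sketch takes the first one
    return first["entityType"], first["geocodePoints"][0]["calculationMethod"]

# Hand-written sample: a country-level fallback described as "approximate"
google_sample = {
    "results": [{
        "types": ["country", "political"],
        "geometry": {
            "location": {"lat": 4.210484, "lng": 101.975766},
            "location_type": "APPROXIMATE",
        },
    }]
}

# Hand-written sample: an address-level match placed on the rooftop
bing_sample = {
    "resourceSets": [{
        "resources": [{
            "entityType": "Address",
            "geocodePoints": [{"calculationMethod": "Rooftop"}],
        }]
    }]
}

google_accuracy, google_precision = google_indicators(google_sample)
bing_accuracy, bing_precision = bing_indicators(bing_sample)
print(google_accuracy, google_precision)
print(bing_accuracy, bing_precision)
```

Production code would also need to handle empty result sets and multiple candidates, which this sketch ignores.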

From a coding perspective, evaluating the usefulness of the geocodes using the combination of accuracy and precision looks daunting. First, the methods deliver a large number of possible accuracy values, and not all of them documented. Of course, if they are not documented, then you cannot code around them. Second, precision is context/accuracy specific, yet neither vendor provides a neat correlation of the two, leaving you to figure it out on your own. Then of course there’s the problem of inconsistent definitions between the vendors. For example, with Bing Maps, “rooftop” means just that but with Google it means a range of possibilities specific to a street address, such as parcel center, rooftop, or street side centroid.
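One way to cope with the undocumented values is to whitelist the combinations you trust and route everything else to review instead of failing. The sketch below does that for Google-style output. The grouping of types into “address level” and “fallback”, and the three verdict labels, are my own assumptions rather than anything the vendor defines.

```python
# Accuracy values treated as address level (an assumption; extend as needed)
ADDRESS_LEVEL_TYPES = {"street_address", "premise", "subpremise"}
# Accuracy values treated as known higher-level fallbacks
FALLBACK_TYPES = {"route", "locality", "postal_code",
                  "administrative_area_level_1", "country"}
# Precision values worth trusting, but only alongside an address-level type
PRECISE_LOCATION_TYPES = {"ROOFTOP", "RANGE_INTERPOLATED"}

def classify(types, location_type):
    """Return 'accept', 'fallback', or 'review' for one geocoding result."""
    if ADDRESS_LEVEL_TYPES.intersection(types):
        # Precision only means something in the context of the accuracy level
        if location_type in PRECISE_LOCATION_TYPES:
            return "accept"
        return "review"
    if FALLBACK_TYPES.intersection(types):
        return "fallback"
    # Anything unrecognized, including undocumented values, goes to review
    return "review"

print(classify(["street_address"], "ROOFTOP"))
print(classify(["country", "political"], "APPROXIMATE"))
print(classify(["subway_station"], "APPROXIMATE"))
```

The point of the default branch is that a new or undocumented accuracy value never silently passes as a good address match.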

Fortunately, both methods provide a convenient proxy for a combination of accuracy and precision, one that greatly simplifies coding. In addition to the map coordinates of the location, both methods deliver the coordinates of a bounding box enclosing the location. Think of it as a search rectangle, useful for searching around a location for possible advertisers to show in search results. As it happens, it also makes a convenient proxy for quality. Think of it this way: the smaller the search rectangle, the better the quality of the geocode. To use the search rectangle this way, all you have to do is calculate the length of its diagonal. The shorter the length, the more accurate the geocode. To illustrate this point, we ran a few hundred addresses around the world through the APIs. The table below compares the average length of the diagonal (in meters) with the accuracy and precision values output, in this case by Google’s API. With Bing Maps you get a similar pattern. While I would not say the test gives a definitive guide on how to write the code, it does give you an idea.

Accuracy vs Precision
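The diagonal calculation itself takes only a few lines. In the sketch below, I assume Google’s bounding box arrives as `geometry.viewport` with `northeast` and `southwest` corners, and Bing Maps’ as a `bbox` array ordered south, west, north, east; verify both against the current documentation. The sample boxes and the `MAX_DIAGONAL_M` cutoff are made up for illustration, so derive your own threshold from your own test data.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def google_diagonal_m(geometry):
    """Diagonal of a Google-style viewport, southwest corner to northeast corner."""
    sw = geometry["viewport"]["southwest"]
    ne = geometry["viewport"]["northeast"]
    return haversine_m(sw["lat"], sw["lng"], ne["lat"], ne["lng"])

def bing_diagonal_m(resource):
    """Diagonal of a Bing-style bbox given as [south, west, north, east]."""
    south, west, north, east = resource["bbox"]
    return haversine_m(south, west, north, east)

MAX_DIAGONAL_M = 500  # hypothetical cutoff, not a vendor recommendation

# Made-up bounding boxes covering the same rectangle in each vendor's layout
google_geometry = {"viewport": {
    "southwest": {"lat": 51.9277, "lng": -8.5717},
    "northeast": {"lat": 51.9304, "lng": -8.5690},
}}
bing_resource = {"bbox": [51.9277, -8.5717, 51.9304, -8.5690]}

google_diag = google_diagonal_m(google_geometry)
bing_diag = bing_diagonal_m(bing_resource)
print(f"Google: {google_diag:.0f} m, usable: {google_diag <= MAX_DIAGONAL_M}")
print(f"Bing:   {bing_diag:.0f} m, usable: {bing_diag <= MAX_DIAGONAL_M}")
```

A single numeric cutoff like this sidesteps the long lists of accuracy values, at the cost of treating every kind of location the same way.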

Google and Bing Maps offer really cool and useful geocoding APIs. They are so cool, and so powerful at returning map coordinates for almost anything you throw at them, that it’s highly tempting to automatically use them for your geocoding needs and disregard other, perhaps less hip, alternatives. Before doing so, you have to first consider why you are geocoding in the first place. Further, implementing the APIs themselves is easy. Understanding and correctly consuming the outputs is where the real effort comes in.

Author: Daniel Brasuk