Exploring Local
Mike Dobson of TeleMapics on Local Search and All Things Geospatial

Google and Map Updating

January 6th, 2010 by MDob

We have previously spent some time discussing how Google compiled its Google-Mapbase and now turn to how Google intends to update it. Just to make this all flow easier, we will use the following definitions, based on ISO 14825, the International Standard for Intelligent Transport Systems – Geographic Data Files (GDF) – Overall data specifications. (It’s expensive to purchase, but if interested you can find more info here.)

The information in quotes below is from the ISO GDF Standard.

“Accuracy – Closeness of results of observations, computations or estimates to the true values or the values as accepted as being true.”

“Up-to-dateness – Closeness in time of the geographic data to present reality.”

“Completeness – Extent to which the specified features are present.”

“Ground-truth surrogate – reference source of sufficient completeness and/or accuracy that may be substituted for field verification when measuring the completeness and/or accuracy of map database features, attributes or properties.”

“Attribute – characteristic of a feature which is independent of other features.”

“Feature – database representation of a real world object.”

“Property – combination of attribute and relationship values which pertain to a feature and which together define a certain characteristic of the feature.”

“Relationship – characteristic of a feature involving other features.”
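To make this vocabulary a little more concrete, here is a rough sketch of how the feature/attribute/relationship/property terms might be expressed as data structures. This is purely my own illustration, not part of the GDF standard and not anyone’s actual schema:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Attribute:
    # A characteristic of a feature that is independent of other features,
    # e.g. ("street_name", "Main St") or ("speed_limit", "35 mph").
    name: str
    value: str


@dataclass
class Relationship:
    # A characteristic of a feature involving other features,
    # e.g. a turn restriction between two road segments.
    kind: str
    related_feature_ids: List[str]


@dataclass
class Feature:
    # The database representation of a real-world object
    # (a road segment, a building, a point of interest).
    feature_id: str
    attributes: List[Attribute] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def property_view(self) -> Dict[str, object]:
        # A "property" combines attribute and relationship values that
        # together define a certain characteristic of the feature.
        props: Dict[str, object] = {a.name: a.value for a in self.attributes}
        props["relationships"] = [(r.kind, r.related_feature_ids)
                                  for r in self.relationships]
        return props
```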

So, let’s get started.

As you know, I ran a test around my neighborhood and found that its representation in the Google-Mapbase was inaccurate, not up-to-date and incomplete compared to that of its competitor NAVTEQ. It is my opinion that the numerous stories in the news about mistakes in the Google-Mapbase reflect these same inadequacies.

It is hard to be more specific about the limitations of the Google-Mapbase, since I do not have access to the complete Mapbase dataset. My experience in mapping tells me that Google’s map data likely suffers from incompleteness in terms of the features represented across the map coverage they provide and inconsistency in the attribution of map features within that coverage. Google used imagery from satellites and other aerial platforms to create their geometry layer and conflated other data to that geometry in an attempt to create the attribution necessary for cartographic display and for using the database for navigation and route guidance.

While conflation can be a useful tool, it is one that produces inconsistency, since these data are collected from a variety of sources whose data accuracy specifications and data collection needs (attributes) reflect their objectives in creating the data, but not necessarily those of Google. In effect, the Google-Mapbase might look like the distribution of feathers inside of a down comforter, with data globbed-up (it’s a technical term) in some areas and spread very thin in other areas. In addition to the problems of data compatibility exposed by conflation, we need to acknowledge that the sources of the data conflated by Google are spatially disparate, do not produce a uniform quality of data and likely update their data on variable schedules. In other words, I expected the initial Google-Mapbase to be non-homogeneous in terms of accuracy and completeness, and it appears to have met this expectation. I am sure that this was what Google expected, although they might not have realized that the initial data quality would be as low as it has been reported.
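As a purely illustrative sketch of why conflation breeds inconsistency (the sources, segment ids and field names below are invented, not Google’s), imagine attaching attributes from a secondary source to a base geometry layer. Wherever the secondary source was collected to someone else’s specification, the gaps show up immediately:

```python
# Hypothetical illustration of conflation: attach attributes from a
# secondary source to a base geometry layer keyed by segment id.
base_geometry = {
    "seg-001": {"shape": [(0.0, 0.0), (0.0, 1.0)]},
    "seg-002": {"shape": [(0.0, 1.0), (1.0, 1.0)]},
}

# The secondary source was built to its creator's specification:
# it covers only part of the network and omits attributes we need.
county_gis = {
    "seg-001": {"street_name": "Main St", "speed_limit": None},
    # seg-002 is simply missing from this source.
}

conflated = {}
for seg_id, geom in base_geometry.items():
    record = dict(geom)
    record.update(county_gis.get(seg_id, {}))   # take whatever the source offers
    conflated[seg_id] = record

# The result is "globbed-up" in places and thin in others:
# seg-001 has a name but no speed limit; seg-002 has geometry only.
for seg_id, record in conflated.items():
    missing = [k for k in ("street_name", "speed_limit") if record.get(k) is None]
    print(seg_id, "missing:", missing or "nothing")
```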

So, what Google needs to do is to find a method to update their data that will remediate these problems and provide them a better mousetrap than that used by NAVTEQ and TeleAtlas. Some would say that they will also need to have a better product than OpenStreetMap. Hmmm.

In order to appreciate Google’s approach to updating the Google-Mapbase, we need to think about the strategy that Google is trying to optimize by creating their own mapping and navigation data. My belief, as stated previously in this blog, is that Google believes that it requires more accurate map and address information in order to be able to deliver customers to its partners who advertise their business or service through the Google Network using the Google AdWords product. In addition, Google is interested in improving the quality of its map database and navigation tools in order to be able to deliver relevant, spatially targeted advertising to people using mobile phones. Google’s Automotive Division will try to expand this sphere of influence to in-car communication systems, while the Google Mothership tries to use these same map and navigation data to geotarget advertising to online users of its browser and other cloud-based products.

These use requirements indicate that the Google-Mapbase needs to include all transportation arteries (highways, county roads, streets, etc.) mapped correctly (position, classification, etc.) and all of the address information describing the location of houses and businesses along the street face attributed correctly (street name, street addresses, address position, street geography – which neighborhood, city, county, state, country, etc.), including additional attributes defining the businesses along any street segment in terms of name, contact information, business classification and other identifying variables.
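To picture what “attributed correctly” implies, here is a hypothetical and much simplified street-segment record carrying the kinds of fields described above. The schema, names and values are mine, invented for illustration only:

```python
# Hypothetical, simplified record for one street segment showing the kinds
# of attribution a navigation-plus-advertising use case demands.
segment = {
    "segment_id": "12345",
    "street_name": "Oak Avenue",
    "functional_class": "residential",            # highway, county road, street, ...
    "geometry": [(-87.6501, 41.8781), (-87.6489, 41.8781)],
    "address_ranges": {                            # where the houses are along the face
        "left":  {"from": 100, "to": 198, "parity": "even"},
        "right": {"from": 101, "to": 199, "parity": "odd"},
    },
    "admin_hierarchy": {                           # which neighborhood, city, county, ...
        "neighborhood": "West Loop", "city": "Chicago",
        "county": "Cook", "state": "IL", "country": "US",
    },
    "businesses": [                                # the advertisable content on the segment
        {"name": "Corner Bakery", "address": "120 Oak Avenue",
         "category": "restaurant", "phone": "+1-312-555-0100"},
    ],
}
```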

Note that Google’s focus on navigation and route guidance means that it is playing in a different ballpark than OSM. At present, OSM seems comfortable producing cartographic data for map presentation, but does not compile all of the routing attributes needed to create an industrial-quality navigation database capable of providing route guidance. In turn, there is reason to suspect that, at present, the weakest link in the Google-Mapbase is the lack of significant detail describing features and transportation artery attributes that are required to successfully navigate an active vehicle along a path from an origin to a destination.
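One way to picture the gap between cartographic data and a navigable database is a simple completeness check on the routing attributes a guidance engine needs. The attribute list below is illustrative only, not OSM’s or Google’s actual schema:

```python
# Illustrative check: does a segment carry the attributes a route-guidance
# engine needs, beyond what is required to simply draw it on a map?
ROUTING_ATTRIBUTES = ("one_way", "turn_restrictions", "speed_category",
                      "vehicle_restrictions", "connectivity")

def missing_routing_attributes(segment: dict) -> list:
    return [attr for attr in ROUTING_ATTRIBUTES if segment.get(attr) is None]

cartographic_only = {"street_name": "Oak Avenue", "geometry": [...]}  # draws fine
print(missing_routing_attributes(cartographic_only))
# -> all five attributes are absent, so the segment can be mapped but not routed
```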

Well, let’s take a look at standard updating practices and see what Google might do to save the day.

Many mapping companies that create purpose-built geographical databases (rather than simply re-use those created by others) collect the first compilation of the data based on a clear and unambiguous set of specifications for data accuracy, up-to-dateness and completeness. Data that do not meet the compilation specification are re-collected, or prioritized and placed in the update queue, depending on the nature of the problem their inaccuracy presents. In addition, companies involved in data collection canvass the world to see if there are any existing data sets that might be adapted to complement their data collection efforts.
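A rough sketch of that practice might look like the following: validate each collected record against the compilation specification and push failures into a prioritized update queue. The thresholds, fields and priority rule are invented for illustration:

```python
import heapq

# Invented compilation specification: required attributes and a
# positional-accuracy threshold, in meters.
REQUIRED_FIELDS = ("street_name", "functional_class", "address_ranges")
MAX_POSITIONAL_ERROR_M = 5.0

update_queue = []  # (priority, segment_id, problems); lower number = fix sooner

def validate(segment: dict) -> None:
    problems = [f for f in REQUIRED_FIELDS if f not in segment]
    if segment.get("positional_error_m", 0.0) > MAX_POSITIONAL_ERROR_M:
        problems.append("positional_accuracy")
    if problems:
        # Prioritize by how badly the inaccuracy hurts the product:
        # a bad arterial road outranks a bad alley.
        priority = 1 if segment.get("functional_class") == "arterial" else 2
        heapq.heappush(update_queue, (priority, segment["segment_id"], problems))

validate({"segment_id": "s1", "functional_class": "arterial",
          "positional_error_m": 12.0, "street_name": "Elm St"})
validate({"segment_id": "s2", "functional_class": "alley",
          "positional_error_m": 1.0})
print(heapq.heappop(update_queue))  # the arterial segment comes up first
```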

While many organizations collect spatial data on a special-purpose basis, it is unlikely that any of these entities will collect data that mirrors the specification being used by the company determined to create a navigable database. It is relatively easy to find cartographic data that will allow you to put together a database for purposes of simple mapping, but an entirely different issue to collect spatial data that is attributed in a manner that will allow navigation and route guidance.

Most programs to update the data of critical importance in cartographic and navigation-quality databases are geared to improving data quality based on a practice of systematically working through or “touching” the entire map coverage over some reasonable period of time. Often, this process is based on fully updating large geographical extents each year, progressing towards complete revision of the map database during a three- or four-year cycle. The effort we are describing here is known as a “planned update cycle.”
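As a toy illustration of a planned update cycle, imagine partitioning the coverage into regions and assigning each region to a year, so the whole map is “touched” once per cycle. The regions and cycle length here are invented, not any vendor’s actual plan:

```python
# Toy planned-update schedule: touch every region once over a four-year cycle.
regions = ["Northeast", "Southeast", "Midwest", "South Central",
           "Mountain", "Pacific", "Alaska/Hawaii", "Territories"]
CYCLE_YEARS = 4

schedule = {year: [] for year in range(1, CYCLE_YEARS + 1)}
for i, region in enumerate(regions):
    schedule[(i % CYCLE_YEARS) + 1].append(region)

for year, assigned in schedule.items():
    print(f"Cycle year {year}: fully re-survey {assigned}")
```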

Overlaid on top of this stepwise, comprehensive update is an effort that implements various “change detection” measures to prioritize map data changes that dramatically impact product quality, especially in fast-growing urban areas. User feedback (usually complaints) forms an important part of this activity, but the greatest asset here is the use of current imagery from an aerial platform, which can quickly show how well the imagery and the map database views of the real world are synchronized.
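A minimal sketch of how such change-detection signals might be folded together (the weights, cells and numbers are invented): score each map cell by imagery disagreement, user reports and growth, and work the highest-scoring cells first:

```python
# Invented scoring: combine change-detection signals into a work order.
cells = [
    {"cell": "A1", "imagery_mismatch": 0.70, "user_reports": 12, "growth_rate": 0.08},
    {"cell": "B4", "imagery_mismatch": 0.10, "user_reports": 1,  "growth_rate": 0.01},
    {"cell": "C2", "imagery_mismatch": 0.45, "user_reports": 30, "growth_rate": 0.15},
]

def change_score(c: dict) -> float:
    # Weight imagery disagreement most heavily, then complaints, then growth.
    return (0.6 * c["imagery_mismatch"]
            + 0.3 * min(c["user_reports"] / 20, 1.0)
            + 0.1 * min(c["growth_rate"] / 0.10, 1.0))

for c in sorted(cells, key=change_score, reverse=True):
    print(c["cell"], round(change_score(c), 2))
```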

The next conceptual overlay in the update process is known as the “opportunistic update,” which takes advantage of the discovery of new or updated collateral data sources that can be used to enhance a “planned update” cycle. Most commonly, this activity results from the discovery of a new data source that operates on a government level, such as a city or a county that has created a GIS database for planning or development purposes. These data are evaluated for fitness of use and, if qualified, are imported or conflated to a master database that normally contains an accurate geometric representation of transportation features in a specific geographic area, but may lack attribute data as accurate or comprehensive as that available in the new data source.
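An opportunistic update might be gated by a fitness-for-use check along the lines of the sketch below; the criteria and thresholds are invented for illustration, and only sources that pass would be conflated into the master database:

```python
# Invented fitness-for-use gate for a newly discovered government data source.
def fit_for_use(source: dict) -> bool:
    return (source["positional_accuracy_m"] <= 7.5       # meets the geometric spec
            and source["last_updated_year"] >= 2008       # recent enough
            and source["coverage_pct"] >= 90              # covers the area of interest
            and source["license_allows_commercial"])      # usable at all

county_gis = {"name": "Example County centerlines", "positional_accuracy_m": 3.0,
              "last_updated_year": 2009, "coverage_pct": 97,
              "license_allows_commercial": True}

if fit_for_use(county_gis):
    print("Import/conflate", county_gis["name"], "into the master database")
else:
    print("Reject", county_gis["name"])
```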

Companies involved in updating databases used for navigation and route guidance actively work to harmonize these data, and success in managing this task often distinguishes the players from the pretenders in the map database world. Harmonization is the database builder’s attempt to manage data quality to meet specific requirements that qualify the data as fit for specific uses, in this case mapping, navigation and route guidance. Companies that actively work to harmonize their data are attempting to produce “controlled data” that is generally verified by some method of field observation. Geographic data that has been compiled based on external specifications is considered uncontrolled data, and its inclusion results in a hybrid database whose accuracy and harmonization are considered less reliable than those of a controlled database.
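To make the controlled/uncontrolled distinction concrete, here is a tiny illustrative tally (the records and verification labels are invented): mark each record by how it was verified and report how much of the database a field check actually stands behind:

```python
# Illustrative tally of controlled vs. uncontrolled records in a hybrid database.
records = [
    {"id": "s1", "verification": "field_observation"},   # controlled
    {"id": "s2", "verification": "field_observation"},   # controlled
    {"id": "s3", "verification": "external_source"},     # uncontrolled
    {"id": "s4", "verification": "user_report"},         # uncontrolled
]

controlled = sum(1 for r in records if r["verification"] == "field_observation")
print(f"{controlled}/{len(records)} records "
      f"({100 * controlled / len(records):.0f}%) are controlled data")
```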

NAVTEQ provides the purest, most controlled map and navigation database. TeleAtlas, since its acquisition by TomTom, sits firmly in the hybrid camp, relying on MapShare and probe data for a significant amount of its data validation. Google also falls in the “hybrid” category, but is in this category by design and intent. The question that we need to pursue looks like this: “Is it possible to manage the data quality of hybrid databases for mapping, navigation and route guidance to meet specific accuracy requirements?”

It is my belief that Google looked at the map data quality provided by its old suppliers NAVTEQ and TeleAtlas and decided that these data did not meet the specific accuracy requirements that Google needs for success in its markets. My sources tell me that Google believed that the geographical data supplied to it were out of date, inaccurate, incomplete and inconsistent.

Before NAVTEQ was acquired by Nokia, I analyzed the financial reporting information that they were required to provide as a registered public company, so that I could determine, to the extent possible, their expenditures related to updating their map and navigation database. My calculation was that NAVTEQ may have spent over $300,000,000 in 2007 on updates and database extensions. I was not able to specifically nail down costs, as the expenditures reported in the public documents were not categorized in the same manner over the reporting periods I examined. If my analysis of the NAVTEQ financial data is even only directionally correct, it is clear that spending lots of moolah may not be an effective way to improve the quality of map updating.

It is my belief that Google came to believe that the solution to improving the quality of map databases was more related to method than to spending. Further, the method that Google is betting on to produce this improved quality is “collective intelligence”. Yep, Google is betting on the Borg. What is even worse is that they may be making the right bet.

Well, let’s look at the role of collective intelligence and map updating in the next borg – I mean blog. Just to get you ready for next time, look at this modest update of the map database production cycle. Google is determined to turn this on its head, while supplementing what they are doing with traditional approaches, just in case. Tune in next time for the real exposé.

Generalized map database and update model


We will provide a detailed, but very different model next time.



Posted in Authority and mapping, Data Sources, Google, map updating, Mapping, Mike Dobson, Navteq, routing and navigation, TeleAtlas

