Exploring Local
Mike Dobson of TeleMapics on Local Search and All Things Geospatial

Better Maps Through Local Thinking

July 9th, 2010 by MDob

I apologize for the delay in posts. A couple of months ago I had to move Exploring Local to a new platform and the support of the service provider has been less than stellar. In fact, I am not sure this blog will publish, since I tried it yesterday and it broke the system. I hope it works this time.
Although I have a PhD in Geography, my graduate work was focused on cartography. For many years, I thought I knew something about maps, map databases, GIS, navigation and a host of related topics. Indeed, I was so sure of myself that I taught undergraduate and graduate level courses at the State University of New York at Albany espousing my views on maps and mapping. Having grown even more certain that what I had learned about maps was true, I accepted a position at Rand McNally & Company and was their Chief Cartographer for thirteen years. Eventually, I decided to go solo and opened TeleMapics, my consultancy. I have spent most of the last decade working with clients who are interested in using spatial data for maps, navigation, LBS and a variety of practical applications.

What I learned about map databases during my career is that making them is a data driven process. Without accurate data, your application cannot communicate the information that will resolve the spatial question that had caused the user to refer to your database in the first place. While a large number of top-down processes are applied to make map data useful, such as simplification, symbolization, classification and other aspects of a process that cartographers call map generalization, all of these manipulations, in the end, are constrained by the quality of map data.

From my perspective, creating maps and map databases is an example of a data-driven, or bottom-up, process. From a logical point of view, if the data do not allow an inference, it would be incorrect to assume one. For instance, the road overlay on the Google map of Providence does not match Google's own imagery. At some point in his travels the driver must execute an ASM (Arnold Schwarzenegger Maneuver), accelerating the car to 162.615 miles per hour to leap the intervening quarter mile where the imagery shows no connecting road.

Maps and reality can vary - trust reality.
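As a back-of-the-envelope check on the quip, the quoted speed is consistent with simple drag-free projectile motion only if the driver also finds a conveniently angled ramp. The sketch below is my own arithmetic, not anything from the imagery; the ramp angle it solves for is an assumption the map certainly does not supply:

```python
import math

# Ballistic check of the quarter-mile "leap" (no air drag, launch and
# landing at the same height -- generous simplifying assumptions).
g = 9.80665                      # gravity, m/s^2
gap_m = 402.336                  # a quarter mile, in metres
v = 162.615 * 0.44704            # the quoted speed, converted mph -> m/s

# Range formula R = v^2 * sin(2*theta) / g, solved for the launch angle.
theta_deg = math.degrees(math.asin(gap_m * g / v**2) / 2)
print(f"required ramp angle: {theta_deg:.1f} degrees")  # about 24 degrees
```

In other words, even the joke depends on conditions the database cannot verify.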

Yep, I know, I am being too harsh. After all, somebody is going to write the mother of all algorithms that will pick, pack, prep, collect, coordinate, conform, mish, mash and manipulate all of these diverse data sources into a cohesive, spatially accurate map database. Sorry people, but it’s not going to happen. There are fundamental data collection problems in the world of spatial data that still are not resolved. For example, where can you find a definitive, authoritative, up-to-date, comprehensive database of all occupied residences in the United States? While the US Bureau of the Census might have one, they can’t give it to you due to Title 13 restrictions on privacy. Hmmm. Is there an alternative? If you suggested the USPS, you need to do some homework. Did you know that even if you could find this database publicly available (and you cannot), you could not find the map data on which to display all of the streets for all of the addresses contained in it?

Apparently, it doesn’t seem to bother consumers that the map databases provided by the major map database producers are incomplete, out of date and otherwise erroneous. Maybe if you just keep slathering on the eye-candy everyone will forget that you can’t use these maps to get from here to there and sometimes you can’t even find “here”.

Conversely, map producers know that map database creation and updating is a game of Whack-A-Mole that they can never win. Even utilizing crowdsourcing, data collection vans instrumented with inertial systems, lidar and who knows what else, they have been unable to keep their databases accurate or up-to-date in a uniform and comprehensive manner. In fact, there may not be enough money in the world to create an up-to-date, accurate and comprehensive street map of, say, the United States. Of course, this raises the question of what accuracy and comprehensiveness actually mean in the world of mapping. From a practical perspective the answer is unappealing, but since you asked, the “commercial” answer is, “What’s good enough.”

Really? Yep. Commercial mapmakers don’t actually know how good or how bad their data are at any point in time, because they have no effective way to test their completed database. Yes, they have quality control and quality assurance procedures and yes, they may be ISO certified, but that does not stop them from distributing data that are clearly erroneous.

Sadly, modern routing engines and the databases they use are prime evidence for this contention. I have read numerous articles in which the reporter indicated that they previously used Google for their mapping and routing needs, but stopped doing so when Google decided to become the poster child for bad map-making practices. However, I am sure that all of these folks would tell you that they are not satisfied with the mapping and routing service they presently use, but that it is the one that gets them where they need to go more often than the others they have tried.

In fact, the notion of the incompetence of the routes that you get from your online or mobile provider has become a glamorous component of the “war stories” most travelers unroll during an unexpected flight delay. Have you heard this one – “So it told me to take a right out of the driveway, took me through 23 other maneuvers (for some reason it’s never an even number) and eventually navigated me to a place within a block of where I started, although the route covered two miles.” Been there, heard that.

Well, then fixing these errors must be an important task and a priority issue for the companies involved. So, what exotic technologies do these companies use to find and fix the data that are erroneous? The embarrassing truth is that the best indicator of update priorities is customers who tell them what’s wrong with the data and these companies don’t even have to ask. What a great business model, huh?

Conversely, if no one complains about map data quality, there is little chance that bad data gets fixed. It is sort of like the map database version of “Don’t Ask, Don’t Tell”. Common channels for customer corrections include email, phone calls, customer sales calls and more modern systems like the online map error reporting websites of NAVTEQ and Tele Atlas and crowd-sourced error correction applications like TomTom’s MapShare. Yep, no knowledgeable humans on the map database vendor side of the equation – only a simple “Just the facts, Ma’am”, Jack Webb style of interrogation.

But wait, let’s get back to that “What’s good enough” issue. What does that mean? Well, it means that the map data are “good enough” to satisfy the needs of the companies willing to license the data. Yes, the clients will complain about errors that are found by their internal product development teams or convey the errors that are sent to them by their customers, but, by and large, the customers of the leading map database providers are a captive audience. If you need a national database to assist in creating your spatially based services, “who ya gonna call,” other than NAVTEQ, Tele Atlas or someone who packages their data?

How can this be?

In the big leagues, map updating is carried out as a two-tier process. Customer complaints drive topical updating, in which egregious errors are prioritized, researched and corrected. In addition, mapping companies systematically review their entire database on a more or less set schedule, but some areas of the database may not be “touched” for years (if they are rural, remote and unpopulated) while major urban areas may receive more frequent attention.
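A minimal sketch of that two-tier prioritization might look like the following. The thresholds, field names and weights are my own illustrative assumptions, not any vendor's actual practice:

```python
def review_priority(area: dict) -> int:
    """Rank an area for review: complaint-driven fixes come first,
    dense urban areas get scheduled attention next, and rural or
    remote areas may wait years for another look."""
    if area["open_complaints"] > 0:
        return 0        # tier 1: reported, egregious errors
    if area["population_density"] > 1000:
        return 1        # tier 2: frequently revisited urban areas
    return 2            # tier 3: rural and remote, rarely touched

areas = [
    {"name": "downtown", "open_complaints": 0, "population_density": 5000},
    {"name": "ranch road", "open_complaints": 0, "population_density": 2},
    {"name": "suburb", "open_complaints": 3, "population_density": 800},
]
for a in sorted(areas, key=review_priority):
    print(a["name"])    # suburb first, then downtown, then ranch road
```

Note that the complaining suburb jumps the queue ahead of the larger downtown, which is exactly the "squeaky wheel" update model described above.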

The word “review” is the fly in the ointment here. The reality is that map databases are updated by comparing them to sources (including field observation) thought to be more current or more authoritative than what you already have in your database. If found to be more current, reliable or accurate than what is in the database, the new data (depending on the type and authority of the source) are either added to the database or used to focus further research for updating. If the sources are not better than what you have, they are tossed. When you have collected what you believe to be the best data for a specific location, you are not going to look at this area again until someone complains, or you discover what may be a more up-to-date or authoritative source during your systematic update process. It is for this reason that map databases contain numerous errors waiting to be discovered by unsuspecting users.
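That comparison rule can be sketched in a few lines. The schema and the authority scale below are hypothetical; real vendors track far more provenance than this:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SourceRecord:
    """One source's claim about a map feature (hypothetical schema)."""
    feature_id: str
    observed: date        # when the source last verified the feature
    authority: int        # 0 = anonymous tip ... 10 = field survey

def evaluate_candidate(current: SourceRecord, candidate: SourceRecord) -> str:
    """Adopt clearly better data, queue ambiguous data for further
    research, and toss anything no better than what is on file."""
    newer = candidate.observed > current.observed
    stronger = candidate.authority > current.authority
    if newer and stronger:
        return "replace"      # supersedes the database record
    if newer or stronger:
        return "research"     # promising, but needs verification
    return "discard"          # no better than what we have

on_file = SourceRecord("seg-42", date(2008, 5, 1), authority=7)
tip = SourceRecord("seg-42", date(2010, 6, 15), authority=2)
print(evaluate_candidate(on_file, tip))  # newer but weaker -> "research"
```

The "discard" branch is the trap the paragraph describes: once an area holds what you believe is your best data, nothing forces another look until a complaint or a better source happens along.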

The most valid source for gathering street map data is reality itself, or some method of memorializing its complete details, at least the ones relevant to mapping (as Google does with its Street View service). Compiling from existing maps or spatial databases means that you have bought into the method-induced compilation errors inherent in the procedures used by whoever produced the data you are examining. Unfortunately, you are usually not going to discover the competency of the data or the data gatherers from metadata. So, what’s a body to do?

Crowdsourcing is one response. Although I am a great proponent of crowdsourcing and believe it to be one of the most promising methods of creating up-to-date maps, I am not sure that it will produce a reliable, accurate, comprehensive, seamless street-level database over an extent as large as, say, the United States, or maybe even a small region such as the Delmarva Peninsula.

While the OSM product might be of sufficient quality in a number of cities, it is less likely to be of uniform quality across large physical extents with variable population densities. In addition, I suspect that it might not be of very good quality in areas of low income, high crime and other socioeconomic attributes that would convince many data gatherers to avoid these locations. Yes, OSM does use public data sources when available, but here again, you face the issue of adopting potentially erroneous procedures that plagued the original data compilation process.

My greatest concern about active crowdsourcing (active participation – not probes, which I regard as passive participation) is that I am not sure it is sustainable over long periods. I realize that its present supporters might be willing to dedicate their time to this effort, but what happens in five or ten years? Will willing replacement data gatherers be found, or will OSM become a collection of floating point data sources? Alternatively, might OSM collapse from neglect? Or, might OSM become a series of local data collection tribes? Hmmm. NAVTEQ and Tele Atlas have been around for approximately twenty-five years (since the mid-1980s); will we be able to say the same of OSM in 2030?

I have to admit a bias here. I like the concept of crowdsourcing, but think that in the long run it will prove inferior to the ability of for-profit firms to sustain quality-driven map database updating using traditional field, research and crowdsourcing techniques (as Google does today, albeit inefficiently). On the other hand, I am not sure that international map database producers like NAVTEQ and Tele Atlas can compete in local markets with local companies or for-profit groups that are prepared to compile a map based on local sources.

Let’s talk about how that may happen next time. In order to do so, we will need to discuss why many important markets for map data may be, inherently, local.

I will be at the ESRI UC next week and will let you know if I see anything interesting.



Posted in Authority and mapping, Google maps, map compilation, map updating, Mike Dobson, Navteq, routing and navigation, Tele Atlas, User Generated Content

One Response

  1. Jim D

    Google had it right for awhile. I can’t understand how they added a
    road that does not exist there.

    Navteq has it right except on OVI Maps, which is 50% correct.

    TeleAtlas is still at 50% correct. It’s hard to believe that the hundreds of thousands
    of GPS traces TomTom has uploaded through Home are just ignored.

    Thanks Jim. It does get “Curiouser and Curiouser” in the world of mapping.