Exploring Local
Mike Dobson of TeleMapics on Local Search and All Things Geospatial

Google Maps Stumbles Badly – Crowdsourcing is the Problem*

May 25th, 2015 by admin

Google Maps has had a rough go of it lately. Public relations problems generated by crowdsourced data are at the heart of the conundrum, but the problems are related to two different systems used to support Google’s mapping efforts.

Google Maps’ current public-oriented problems are these:

1) The editing system the company employs for using crowdsourced data that may eventually appear on Google Maps is not authoritative.

2) The local search “folksonomy-oriented” matching algorithm used to match names users enter to find locations on Google Maps was poorly designed.

Both of these “gotchas” are unfortunate and could have been avoided. I, and others, have offered plentiful free advice to Google about the company’s need to tune its spatial data capture to enhance its map database, not to detract from it. Let’s look at the specifics of Google’s latest mapping problems.

Map Maker

With regard to Map Maker, the public relations fiasco focused on Google Maps apparently ingesting a curated edit of a road network whose content outlined an Android-like figure urinating on the logo of its competitor Apple. Indeed, it was reported that just to the east of the peeing Android was a sad face emoticon with the text, “Google Review Policy is Crap.” (See this source to view both images.) As an aside, it is good to see that Google has kept its sense of humor. While I was searching for sources on the “peeing Android,” the ad on Google’s search page was titled, “Manage Overactive Bladder.”

Of course, these have not been the only errors discovered in the company’s use of crowdsourced data – I am sure many of you remember the listing for Edward’s Snow Den located in the White House. How could these types of map spam have been unexpected by Google? And yet, according to the VentureBeat news source cited above, Google’s response included this interesting comment, “We also learn from these issues, and we’re constantly improving how we detect, prevent and handle bad edits.” Hmmm. I never would have classified the Google Maps Team as slow learners. They are world-class brainiacs. Maybe they lack an appreciation of or familiarity with the nuances of cartographic practice?

In any event, in 2011, for example, I wrote a series of blogs analyzing Google’s Map Maker System and the company’s handling of crowdsourced data (e.g. here and here, among other articles on the topic). To save you from having to read the original articles, here is a concise summary – I examined Map Maker and its editing system and found that, due to flaws in the system as it existed at that time, the edited and “validated” information in Map Maker resulting from user generated data should not be considered “authoritative.”

Currently, Google has suspended Map Maker edits and is working on a solution to the “problem” of users contributing invalid, inappropriate, or otherwise erroneous spatial data for use in Google Maps. Let’s talk about what Google might consider doing to solve this problem near the end of the blog.

Google Map Search

Google Maps’ latest problem, highly documented in the press here, here, here, and here, is that it attempted to match unconstrained location identifiers (an uncontrolled vocabulary) entered by users during map search with actual locations on Google Maps (a controlled vocabulary). More specifically, the company chose to employ a purpose-built approach based on the use of an unconstrained folksonomy to match possible surrogate names entered by users during map search queries to find actual names and locations of the POIs (points-of-interest) symbolized on Google Maps.

I fully support the notion of a folksonomy-based approach to local search. As a matter of fact, in 2007, before Google or anyone else in mapping or location search was using the concept, I wrote a blog titled “Controlled Vocabularies, Why local search needs folksonomies.”

Google apparently understood the concept, but was not thorough enough in its implementation.

According to Google Maps’ own blog, the Google Team culled spatial terms from online discussion forums and related these names to known geographical locations. In some cases, the terms they gathered were found to be “offensive.” Really? How unexpected was this obvious, method-induced error? Did they think that they might not find associations between names and places that might be offensive? Have they never read about the riled-up public opinions on naming decisions made by the U.S. Board on Geographic Names? Nevertheless, Google authorities stated that they, “…were deeply upset by this issue, and we are fixing it now.” Hmmm. What other time bombs are yet to be found? Has Google not yet learned that maps and spatial information cannot be handled or considered “…just another information system”?

Maps and the information that they contain will bite you in the ass when you least expect it. My experience comes from years of teaching map making and over a decade spent as the person in charge of all mapping operations at a company that was, at the time, the world’s leading print publisher of maps and atlases. My mantra each morning was, “What’s it going to be today?” Google may be beginning to appreciate the problems of compiling accurate maps, evaluating map data for timeliness and appropriateness, calibrating authoritative editing systems, all while keeping your product up-to-date and editorially acceptable to your user base (it’s that old geographic names thing again).

Conclusions

Problems with their approach to crowdsourcing are at the heart of Google’s current, public, mapping blunders.

Surowiecki in his important work “The Wisdom of Crowds” provided a comprehensive look at user generated content and I urge you to read his book. Surowiecki postulated that taking advantage of the wisdom of crowds depends on the diversity of opinion, independence and decentralization in the crowdsourcing population, as well as the influence of the method used for soliciting contributions. Surowiecki felt that if the crowd contributing data cannot satisfy these conditions, then its judgements are unlikely to be accurate. If he is right, then Google may need to rethink its approach to crowdsourcing data for use in Google Maps, as it appears to me that its current procedures violate almost every aspect of these cautions.

In part, Google’s use of crowdsourced data seems to reflect a belief that the company would have been unable to create as comprehensive a map database on its own as it has been able to create using crowdsourcing. Google rightly reasoned that contributors to its spatial database might not have the same goals as Google with regard to map accuracy and authority. Presumably, it is for that reason that Google evolved a hybrid-edit practice, but then negated the efficacy of the system.

First, it employed internal editors who did not possess the specific local geographic knowledge to assess crowdsourced contributions supposedly describing local geography. Second, it further diluted its goals for the system by the manner in which it allowed its contributors to become one of the components of the authoritativeness of its edit system. In the long run, Google needs to find a way to exert control and authority over its edit system. Until it does, blunders like those described above, and ones that are even worse, will plague its map database.

Google’s goals for crowdsourced data often appear contradictory. While the company wants to harness local knowledge from users, its system allows users to contribute even when they do not have local knowledge and are not located in the region for which changes are being contributed. Map Maker is a prime example of this mismatch. In turn, some review editors also appear not to have the local knowledge that one would think was required to analyze a contributed change made to some aspect of a “local” geography. Using imagery is an understandable, but poor, substitute for local knowledge.

In other crowdsourced mapping systems, edited data are pushed to a live site and then curated until they are “considered” correct (kind of like a ping pong match) by meeting the commonly held notion of what is correct by the community that evaluates them. Data in crowdsourced systems are supposed to be “self-healing” over time. Google, apparently, instituted its editorial review measures because it could not afford for live data to be batted back and forth until judged to be “healed.” For example, it is difficult to design a mapping system or a routing system whose features might be in a constant state of flux. This could create not only incorrect maps, but also non-navigable routes.

Google seems to have designed a system that did not take the “extended” healing path, but one that was just good enough when its product was at a lower profile. Unfortunately, the system is no longer appropriate for the uses to which it is being put. Could these active sources of user generated content be used to navigate autonomous cars? We had better hope Google figures out a fix before that happens.

With regard to the map search problem, Google apparently was aggregating input from people who, presumably, were unaware of Google’s use of these data. While Google seems free to aggregate any information it wants, it boggles the mind that it would do so based on chat room conversations which were certainly not authoritative sources of information on local geography. Creating a folksonomy without considering the authority of the source, or applying a filter for “appropriateness,” was a major, bush-league blunder. In addition, gathering crowdsourced data is influenced, see above, by the method used to solicit information from the targeted population. Google now knows that its method is in error, but will it be able to concoct a user-focused paradigm that elicits data accurate and useful enough for the purposes of Google Maps?
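For readers who like to see an idea in code, here is a minimal sketch of what weighing source authority and filtering for appropriateness might look like when building a folksonomy. It is only an illustration and says nothing about how Google actually builds its matching tables. Everything in it is an assumption made for the example: the source types and their weights, the blocklist, the threshold, the place names, and the function name build_folksonomy.

```python
from collections import defaultdict

# Hypothetical authority weights per source type (assumed values).
SOURCE_AUTHORITY = {
    "official_gazetteer": 1.0,
    "business_owner": 0.8,
    "chat_forum": 0.1,
}

# Hypothetical stand-in for an editorial list of unacceptable terms.
BLOCKLIST = {"offensiveterm"}


def build_folksonomy(observations, min_score=0.5):
    """Map each acceptable alias to the controlled POI name it supports best.

    observations: iterable of (alias, poi_name, source_type) tuples.
    """
    scores = defaultdict(float)
    for alias, poi_name, source_type in observations:
        key = alias.strip().lower()
        # Appropriateness filter: drop any alias containing a blocked word.
        if any(word in BLOCKLIST for word in key.split()):
            continue
        # Authority weighting: unknown source types contribute nothing.
        scores[(key, poi_name)] += SOURCE_AUTHORITY.get(source_type, 0.0)

    folksonomy = {}
    best = {}
    for (key, poi_name), score in scores.items():
        # Keep an alias only if its accumulated authority clears the bar,
        # and prefer the POI with the highest score for that alias.
        if score >= min_score and score > best.get(key, 0.0):
            best[key] = score
            folksonomy[key] = poi_name
    return folksonomy


observations = [
    ("The Old Mill", "Riverside Mill Museum", "official_gazetteer"),
    ("offensiveterm house", "Example Landmark", "chat_forum"),
    ("the pit", "Example Arena", "chat_forum"),
    ("the pit", "Example Arena", "chat_forum"),
]
print(build_folksonomy(observations))
```

In this made-up sample, the gazetteer alias survives, the blocked alias is discarded outright, and the nickname seen only in chat forums never accumulates enough authority to be matched to a place. A real system would need a far richer notion of both authority and appropriateness, but even a crude gate of this kind addresses the two omissions described above.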

Whether or not Google can find a way to effectively engineer and police crowdsourced systems is a topic of interest for them (and for me). My own opinion is that active and passive crowdsourced systems will be critical components in all future mapping systems. Google has the resources to monitor, evaluate, rank and adjust or regulate its crowdsourced geographical data to achieve its goals in mapping, but seems reluctant or unable to mount the specific effort required to confront the problem.

As I have noted here in past blogs, Google engineers don’t necessarily think they are smarter than everyone else, just that they have more and better data with which to examine a problem. Google should have the smarts, resources and the required data in their data lake/reservoir/swamp to analyze the likely validity and usability of crowdsourced map data by creating a consistent, authoritative vetting process. But, maybe not. Or maybe the effort would make it uneconomical.

Well, maybe Verizon will come up with something now that it owns the almost moribund MapQuest. Apple Maps? Well, they could certainly take better advantage of crowdsourced map data, but that does not seem to be of particular interest to them at this time, although it is a technique that could really help them improve the quality of their maps and, especially, their business listing data.

And now for something completely different

While I spend most of my time on assignment for my consulting business or thinking about the problems of mapping and spatial data handling when not on assignment, I do find time for one hobby in particular. Don’t laugh – it’s bird photography. If you are interested in the world of shore birds you might want to take a look at some of my photos – DobsonPhotoArts.com . (While prints are for sale, I do not expect you, the audience of this blog, to buy any images; their purchase is part of a more complex strategy – and fun research in its own right. So don’t alter my sample.)

I hope you had a great Memorial Day Weekend.

Dr. Mike

The reference to the Surowiecki work is as follows:

Surowiecki, James (2005). The Wisdom of Crowds: Why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations. New York, NY: Anchor Books Edition, 306 pp.

*Blog edited on 2015_05_26 to improve readability.

Posted in Apple, Authority and mapping, business listings, crowdsourced map data, Data Sources, folksonomy, Google, Google Map Maker, google map updates, Google maps, map compilation, map updating, Mapping, MapQuest, routing and navigation, User Generated Content, Volunteered Geographic Information

2 Responses

  1. Jim LeClair

    Excellent read… the publishing of crowdsourced info in any map system requires the business owner to enter this game of “ping-pong” to match information that is often already in the native map set and validated by the business owner. Bing’s publishing of Yelp data, then not removing incorrect data points once identified, is absurd. Most map data sets still seem concerned with quantity over quality. We are watching closely as the bidding for (Nokia) HERE maps will add yet another player into the mapping game. Jim

    Hi, Jim:

    Thanks for taking the time to publish your insightful comment. I agree with you that the game is about to get more interesting. Or, some might say, “…curiouser and curiouser.”

    Thanks again,

    Dr. Mike

  2. Geoff Dutton

    Hi Mike,
    Here’s a more pernicious problem with crowd-sourcing maps that someone felt compelled to petition Google to quickly correct:
    https://www.change.org/p/google-inc-remove-maps-to-secret-domestic-violence-shelters

    The geospatial Web is a vast playground for stalkers. Maybe you would like to write about that.

    About 15 months ago I wrote up my take on a startup called connect.com, whose mission is to fashion the geospatial web and social media into a vast playground for marketers and snoops of all sorts.
    http://cowbird.com/story/89143/The_Creep_Factor/

    Thanks for your insights

    Hi, Geoff:
    Thanks for your comment. I apologize, I missed the comment and it has been in the queue since May. Am I embarrassed.

    Dr. Mike