Exploring Local
Mike Dobson of TeleMapics on Local Search and All Things Geospatial

Google, Navteq and Map Compilation

July 6th, 2011 by MDob

And Then There Was…..One?

For this blog I had intended to write an in-depth analysis of the efforts of Google and Navteq related to the map compilation strategies they use to produce navigation quality map databases. However, I have decided to focus on the major differences in their approaches, namely field observation and the use of crowdsourcing, since these should be the deciding factors in determining which company is able to produce maps with the desired coverage, currency and accuracy. Unfortunately, there are a couple of additional factors that may have more bearing on the outcome of this competition than any of the technical issues. Of course, I have learned something from journalists and novelists over the years and will conclude this blog by integrating these mitigating factors into my thoughts on which company will dominate the world of maps, mapping and, perhaps, spatial data.

Fans of OSM might be disappointed to learn that I do not consider OSM a contender for this crown, although I do think the organization has a bright future, but perhaps one that is not as map-centric as it is today. Fans of TomTom will wonder why I have not included Tele Atlas as a participant in the “Map Champion of the World” competition. Well, it is my opinion that TomTom has all of the components to be a great competitor, but lacks the financial ability to implement its map compilation strategy in a comprehensive, robust manner.

Google and Navteq

Those of you who have paged through the PowerPoint in my last blog will have noted that I am convinced that the winners in the world of map compilation will be those who wield a hybrid approach to the subject. The hybrid compilation approach that I envision melds 1) traditional compilation techniques (e.g. field work/field observation, data gathering/data mining, use of imagery, conflation, data editing, data QC/QA) as practiced by staff with a professional training in mapping/GIS, or by staff who have received training in map compilation techniques with 2) crowdsourced gathering of map data. Make no mistake, crowdsourced map data is a required ingredient for success, just as much as is the use of established map compilation techniques.

How do Google and Navteq differ on the two factors of greatest importance?

The field work advantage goes to Navteq.

Field observation can be structured in several ways, but most often field work is directed either in a top-down, conceptually driven manner or by a bottom-up, data driven strategy. My belief is that the best map compilation programs must mix a spoonful of each approach to create the winning elixir. In other words, map compilers need to have a conceptually driven database update program that reflects their best guess about areas covered by their map base that are of strategic importance to leading customers, are experiencing unusually fast growth/economic development, or are undergoing significant change. In addition, most “players” in the map compilation business have an established program to “refresh” all coverages in their database on a cyclical basis, based on the notion that data change over time, new sources evolve and that all areas need to be reviewed for changes on a known cycle (although these cycles may differ by geography).

Conversely, data compiled in a previous field canvass can be riddled with errors (either in the original map compilation or with data that have changed since the last compilation). Most often these errors are discovered by map users or the customers of the map database provider and this is an example of data driven compilation. If the area (such as the boondoggle on I-195 in Providence, Rhode Island) is important enough to customers and users, a field team will usually be deployed to unravel the ground truth if the data cannot be collected and corrected through the use of surrogate data.

I think it is an underappreciated fact that having customers with a vested interest in the accuracy of a company’s map database is an important part of the success of map compilation efforts. Unsatisfied customers have leverage in the map compilation process as they are extremely unlikely to accept blame from their users on a map error that is not their fault. They can and often do wave their contractual agreement at the map compilation company to help the data providers understand the “need” to update maps in areas the customer feels include particularly egregious errors.

While “most” companies listen to their customers and the users of their data, customers, especially ones who contribute significantly to revenue streams, command attention. The more important the customer, say BMW, the likelier it is that the offending change will be corrected sooner rather than later, and if there is an offending area with numerous compilation errors, the more likely it is that a field team will be deployed, or tactical field representatives contacted and dispatched, to determine the actual condition on the ground.

Smaller customers command less attention and mere users of products like you and me are, often, even further down the “sensitivity” chain. The reason that I make this distinction is that Navteq has a number of customer accounts that are extremely important to them and who can snap the compilation whip when they are unhappy with the current state of the Navteq database. Google, on the other hand, while having customers, only indirectly provides support to them (e.g. by the location of their business as a POI or as a map of the location in a Google “local search”). Google’s map compilation system is really a data driven system led by users rather than customers. In other words, the impetus for Google to make changes to their maps is often public embarrassment about their map gaffes highlighted by users, rather than by the direct requirements of customers who are deploying a map database in an attempt to solve a specific problem such as in-car navigation.

I think this distinction is important because it is likely that many map errors never get communicated to Google by its users. For instance, how many times have you corrected a Google Map? Of the errors that do get communicated to Google, some users attempt to use Google Map Maker to correct those errors, and the corrections are then vetted by Google’s crack editing team based in India, who, of course, have a complete lack of local knowledge on which to evaluate these proposed changes. The Google error reporting process is one that is unlikely to generate a field response. Yes, the Google Street View vehicles do operate on a top-down deployment, but it is unlikely that they are deployed as a data driven response mechanism.

While electronic and image sensing activities are conducted by both Google and Navteq, the Navteq field teams and contacts are prepped with specific objectives and provided with task lists before they enter the field and human observation of spatial characteristics is often a significant part of their activity. If Google cannot sense these data, find them through data mining or imagery classification, it is unlikely that they will ever discover the richness of data that Navteq operators are able to observe in the field. In essence, if Google cannot find the solution using an algorithm, it needs to be found through crowdsourcing or not at all.

We could spend more time discussing why Navteq’s field operations return better data accuracy than those of Google, but that would be kicking a dead horse, something for which I am known but do not feel like doing today.

The crowdsourcing advantage goes to Google.

Oh stop! Navteq can tell me a million times how much crowdsourced data it now receives and that still will not convince me of the benefits that this is bringing the company.

There are two types of crowdsourced data. Active data is contributed by a user when they find an error or omission on the map that they are willing to fix, or at least to contribute information on what they observed on the ground, as opposed to what they saw on the mapped database. Passive crowdsourcing is the use of GPS signals from PNDs, car navigation units, Smart Phones or other devices equipped with GPS or capable of tracking other RF (such as Wi-Fi). While a significant amount of data can be extracted from paths, it is mostly related to geometry, position, data flows (for traffic analysis) and the like. Passive data does not provide attributes such as names, addresses, zip codes, contact numbers, roadside furniture (see, I told you once that you needed to read that GDF manual), and other important information. The vast majority of the crowdsourced data received by Navteq and incorporated into its databases is passive. Navteq has very limited inputs of active crowdsourced data and gets very little benefit from the real-world knowledge held by its users and the users of its customers’ products and services.
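For readers who think in data structures, the active/passive distinction can be sketched as two record types. This is a minimal illustration only: the class names, field names, coordinates and street name below are hypothetical and are not taken from any Navteq or Google schema. The point it makes is simply that a passive probe record has no slot for names or addresses, while an active report does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbePoint:
    """One passive GPS sample: position and motion only (hypothetical layout)."""
    latitude: float
    longitude: float
    timestamp: float
    speed_kph: float


@dataclass
class ActiveReport:
    """One user-submitted correction carrying attributes a probe cannot sense."""
    latitude: float
    longitude: float
    street_name: Optional[str] = None
    address: Optional[str] = None
    note: Optional[str] = None


def attribute_fields(record) -> list:
    """List the populated non-geometric fields a record can contribute."""
    geometry = {"latitude", "longitude", "timestamp", "speed_kph"}
    return [k for k, v in vars(record).items() if k not in geometry and v is not None]


# Illustrative values only.
probe = ProbePoint(41.82, -71.41, 1309900000.0, 88.0)
report = ActiveReport(41.82, -71.41, street_name="Example St", note="ramp closed")

print(attribute_fields(probe))
print(attribute_fields(report))
```

However many probe points are collected, the first list stays empty; only the active report contributes attribute data.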

Google, on the other hand, receives significant amounts of both active and passive crowdsourced data. Its active crowdsourcing through map corrections and Google Map Maker corrections provides the company an enormous benefit, although it has not yet managed to create a system that provides them all the benefits that active crowdsourcing can supply. However, my purpose here is not to expound on a better system, but merely to point out that the advantage in the use of crowdsourcing clearly benefits Google Maps, while Navteq’s efforts in active crowdsourcing are too limited to provide any significant benefit to Navteq, other than these data may be used as change detection beacons to help target their field activities.

Crowdsourcing operations are relatively inexpensive to create and maintain. The primary factor is whether your map can be distributed to enough users to bring in a beneficial number of map corrections through active crowdsourcing. Google’s distribution channel is massive in terms of size and reach, while Navteq’s distribution channel is modest, being limited to its Map Reporter and similar relatively unknown resources. Yes, users of Navteq data can and do refer their users to Map Reporter or provide other access for contributing error reports, but, in total, these inputs are insignificant.

So Why Are These Things Important?

Field operations are extremely expensive. While the data quality advantage that Navteq now maintains over Google and everyone else is a competitive advantage, it is unclear to me that Navteq has the financial resources to maintain this expensive advantage.

If you remember back to 2008, one of the reasons that Nokia acquired Navteq was that it was contributing mightily to the Navteq revenue stream and was trying to negotiate lower rates, but Navteq would not budge. The negotiation, then, took an unexpected turn and Nokia told Navteq “Your data is too expensive, so we’re going to buy the company.”

Guess what happened next. Yep, those post-acquisition intercompany transfer prices produced the pricing concessions that Navteq would not offer in the license negotiations. Guess what? Yes, Navteq’s revenue stream has suffered because of this process.

Nokia’s latest innovation is that it has decided not to run Navteq like a for-profit, stand-alone business, but to integrate it into some ill-conceived Nokia business unit called “Location and Commerce” (LAC – now they just need to add knowledge and they can call the business LACK) under Michael Halbherr. Larry Kaplan, who as CEO has managed Navteq for the last couple of years and understood the complexities of the business and business model, will depart the company at the end of the year. Michael Halbherr apparently has limited experience in building map databases or in managing map database companies, but heck, he has familiarity with selling products that use the data from his stint as CEO of gate5 AG, another of Nokia’s “successful” acquisitions. Good luck Michael and please try not to kill off the only substantive alternative to Google Maps.

Hmm. I guess that means that Navteq better get used to hearing “Your field operations cost too much and you need to reduce expenses because the Location and Commerce business isn’t producing any revenue other than the revenue you are generating. Of course, your revenue is what is funding our amazing, Finnish-style, sauna-driven management structure and market-mismanagement structure. And, by the way, why aren’t you using more of that low-cost active crowdsourcing as a replacement for your field operations?” (If this question is asked, those of you at Navteq should take a quick peek at the slideshow mentioned in my last blog.)

You may remember that in 2008 I told you that Nokia wanted to become an advertising company. This statement was based on my assumption the only way to make money with maps and phones would be to sell advertisements to people who are searching for local attractions. I concluded then that Nokia did not have the wherewithal to make this transition and I think that opinion is still valid. Nokia’s CEO Stephen Elop indicated that “Focusing on location and commerce is a natural next step in Nokia’s Services journey.” I guess somebody must be familiar with the amazing successes of “Nokia’s Services Journey”, but I am having a hard time coming up with any service related success that the company has achieved.

In my last blog, I mentioned rumors, heard from contacts at several conferences held over the last few weeks, about the SS Navteq being set afloat. We all knew that something unpopular was going on at Navteq, but my guess was made before I became aware of the Nokia reorganization of Navteq. As a consequence, a change in ownership might not be in the immediate future. On the other hand, I suspect that within the next two years, Nokia itself will be sold or decide that it is unprepared to successfully compete in the location and commerce business and sell the location and commerce business to someone, say like NavInfo.

The real problem here is that Nokia is slowly beating the competitiveness out of Navteq and its staff. In addition, I suspect that Google will eventually find the right strategy in crowdsourcing and data mining and begin to create data that can actually be used for navigation, as opposed to routing a table of points, as they do today. When this happens, Google will be able to compete with Navteq across a wide variety of markets. Navteq, of course, could maintain, or even extend their current lead over Google, but it appears that malaise has gripped the organization and it may be that the company’s workforce no longer believes that it can maintain its market leading position. From a practical point of view, I believe that sentiment to be untrue, but I can understand why the Navteq workforce has doubts about the future.

Let’s see how things have worked out in the world of navigation map databases. TomTom acquired and, then, killed Tele Atlas through mismanagement brought on by financial difficulties. Nokia acquired and is killing Navteq through general incompetence and financial difficulties.

Who will be left in the world of mapping? Google, that’s who. So Microsoft, if you want to have mapping and routing in the future, either buy Nokia or adapt OSM and do what you can with it. Of course, it might just be cheaper to buy MapQuest, since they already seem to know what to do with OSM.

Happy trails – and Navteq, I’m hoping you will not abandon your market leading position without a fight, although that might involve fighting Google and Nokia.

Now, it’s time for me to fight the final boss in Red Faction Armageddon. Hope I survive.

Next time, unless someone beats me on the head, I plan to explore some new topics. Stay tuned.

Posted in Authority and mapping, Google, Google Map Maker, Google maps, Mapping, Microsoft, Navteq, Nokia, Nokia Ovi Maps, OSM, TeleAtlas, TomTom, User Generated Content, Volunteered Geographic Information, openstreetmap | 4 Comments »

Presentation From The New York Geospatial Summit 2011

July 1st, 2011 by MDob

You would think that I have just got up from a long winter’s nap, but I have spent most waking moments for the last month working on a project that was delivered yesterday.

The only spare time I had last month was spent on a presentation that I made at the New York State Geospatial Summit 2011 held in Skaneateles, New York on June 16. I was energized by the crowd. They were wonderful sources of information, as I discovered during breaks and meals. I have posted my presentation from the conference on SlideShare and have embedded it below for viewing. If you click the link at the top of the presentation and view the show at SlideShare you will be able to see my brief presentation notes.

Those of you who have suffered through my presentations know that they are prepared, packed with information of interest to me and delivered in a casual but rapid-fire style. In other words, the slides may not provide the same depth on the numerous ideas I presented during my color commentary of the images. My apologies, but I think you might find the presentation interesting anyway.

This weekend I will be drafting my next blog. It will be the follow-up to the last blog on Google Map Maker and how Google and Navteq will battle it out for supremacy. Well, maybe. I understand the SS Navteq may soon be set afloat and have new owners. Could it be true? Stand by and I will let you hear the story early next week (but after the holiday).

Posted in Authority and mapping, Data Sources, Google, Google Map Maker, Google maps, Mapping, Navteq, Nokia, Nokia Ovi Maps, OSM, User Generated Content, Volunteered Geographic Information, crowdsourced map data, map compilation, map updating, openstreetmap, shameless self-promotion | No Comments »

The Google Map Maker Review and Authority System

May 24th, 2011 by MDob

Please read yesterday’s blog before reading this one. It will make a lot more sense if you do.

Based on my experience with the Map Maker described in yesterday’s blog, the edit system is deeply flawed, at least in its present incarnation. Just so you know that all of those edits really happened – see that MDob in the right corner?

See.  I really did the edits I wrote about

Unfortunately my experience with the authority system in Google Map Maker was, perhaps, even more troubling than my exposure to the edit system. The “background” for this statement is described below. I point out here that I attempted only three edits and received reviews only on these three edits. As you will read later, those three reviews and an analysis of other edits by these reviewers opened the door to a broader view of the Map Maker and its “trusted reviewers.”

The first edit I attempted concerning the lane in the parking lot that was erroneously displayed on the Google Map as connecting to an adjacent street is a prime example of the structural weakness of the edit/authority system used in Map Maker.

While preparing to edit the map, I was quite certain that my recollection that the lane in question did not intersect the adjacent street was correct. I looked closely at Google’s satellite imagery and decided it was not clear enough to allow me to confirm my recollection.

Another alternative to resolve this problem was to use the Google Street View imagery. While it provided clear evidence that the aisle in the lot did not connect to the adjacent street, I could not find details on the date of the imagery and could not resolve the issue due to the absence of metadata on the Street View imagery available in either Map Maker or Google Maps.

I concluded that the only way to determine whether the streets connected or not, at least in this case, was field examination. So I hopped in my car, drove to the location, and did a field inspection. The field inspection closed the question. I took a set of photos as positive proof that my assertion that the lane and the road did not connect was true.

Yes, current satellite imagery or current Street View imagery might have been used to resolve the issue. However, since the Map Maker edit reviewers presumably do not have access to the metadata on the date these images were captured, they cannot definitively determine connectivity or the lack of it in the road involved in the edit I contributed. We can, also, presume that, unlike Jason Bourne, Google’s trusted reviewers do not have the ability to retask satellites or redeploy Street View vehicles to resolve the situation.

I suppose the “trusted reviewers” could look at the maps on other online services for this particular location. In respect to the edit in question, the Navteq and Bing websites agree with Google. On the other hand, OSM, Yahoo (Navteq data), and TomTom don’t show the parking area at all. Let’s see, that’s three in support and three against. What to do?
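To make the deadlock concrete, here is a minimal sketch of what a simple “count the sources” check would conclude from the six services just listed. The labels “connects” and “not shown” are my own shorthand for what each map displayed; nothing here reflects how Google’s reviewers actually work.

```python
from collections import Counter

# What each source showed for the parking-lot lane, as described above.
# "connects" = lane drawn as joining the adjacent street; "not shown" = no parking area.
observations = {
    "Google": "connects",
    "Navteq": "connects",
    "Bing": "connects",
    "OSM": "not shown",
    "Yahoo": "not shown",
    "TomTom": "not shown",
}

tally = Counter(observations.values())
top_two = tally.most_common(2)

# A vote only settles the question if one answer strictly beats the other.
is_tie = len(top_two) == 2 and top_two[0][1] == top_two[1][1]

print(dict(tally), "tie" if is_tie else "majority")
```

A three-to-three split gives a reviewer nothing to act on, and even a clear majority would only show that the sources agree with each other, not that any of them is current.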

I suppose Google’s Map Maker reviewers might be tempted to refer to other sources, like I did for purposes of comparison, but in its Moderation Guidelines for Users of Map Maker Google lays out its position quite clearly:

“While moderating, do not post any material that you know, or should reasonably know to violate any law, contractual obligation, confidential information, proprietary information including copyright or the privacy or publicity rights. Google Map Maker expects you to respect and abide by copyright laws and does not condone any violation of copyright law. Users, moderating User Submissions, are expected to be familiar with the applicable copyright laws in their jurisdiction and in case of a doubt the Google Map Maker team encourages you to consult an expert in the field of copyright law in your jurisdiction for guidance.”

Good for you Google. How do you police that action? Do you really expect your volunteer users to consult an expert in the field of copyright law for their jurisdiction and all of the jurisdictions in which they review edits? Hmmm.

Well, for comparison purposes only, I examined the area using imagery available on Navteq’s website, but it did not provide enough detail to provide a clear answer to the question. Bing’s imagery was superior and showed several trees blocking the access of the lane to the street, but there was a lack of metadata about the age of this imagery opening the possibility that the intersection could be a more recent construction than that shown in the imagery. Just so you know, I do not consider a copyright date to be a surrogate for the currentness of the imagery or the map data, since the date of the copyright may have no relationship to the age of the information in the work covered by the copyright.

Given this uncertainty with the currentness of sources and the inadequacies of the Google source data, it appears that local knowledge gathered through field observation is the most immediate and authoritative method that can be used to resolve this particularly knotty problem of road intersection, as well as the best way to solve a large class of problems similar in nature. I drove to the location in question, because there was no other way available to remedy my uncertainty about the location I was attempting to edit.

So, what’s the Google Moderator team to do when they encounter my assertion that the parking lot lane does not intersect the adjacent street as shown on Google Maps? How could they remedy this situation without observing it? Yes, they could look at surrogate data beyond that which I used, but do you really think they are going to spend that much time ferreting out other sources? I doubt it. If you were a moderator, would you? And how long would it take you to find useful sources that did not violate the restrictions on intellectual property that Google asserts for its reviewers?

Even if the Map Maker trusted reviewers did collect other data for purposes of evaluation, how would they know with any degree of certainty that the source was current and reflected the status of an issue on the ground? The answer is that they would not have a clue what the correct call was in the case of my edit, and in thousands of cases like it. “Trusted Reviewer” Nigar was smart enough to reference Street View, but unless he had access to metadata describing the date the data was collected, Nigar could not have known that his decision was based on the reality of the situation on the ground.

The edit in which I marked the actual driveway into the medical office complex does not look like a difficult call, since it clearly connected with the adjacent street in both the satellite and Street View Imagery. Unfortunately, the imagery appears to lack metadata, so who, other than a local observer, really knows whether it reflects an access point that exists, or one that was reconfigured recently, or perhaps one that has a chain across it?

The decision on the underground horse tunnel edit is, also, troubling to me. The feature cannot be seen on the Google supplied imagery or on Street View and never will be unless they take a Street View bike down the horse trails. So how was the decision made to accept my edit?

You know, we could speculate for a long time about how the edits could have been made, but let’s skip that dance and see what Google wants its editors and reviewers to do.

In its Moderation Guidelines for Users of Map Maker Google tells editors and reviewers the following:

“Moderation Guidelines

You agree and undertake that the underlying intention behind moderating User Submissions is to remove any submission that you know through your personal local knowledge either or inappropriate or factually incorrect and to approve such User Submissions that to your personal knowledge are accurate.

Under a section of the Moderation Guidelines titled How To Moderate are these instructions:

i. Approve- you can approve a User Submission, if from your personal local knowledge you are sure that the User Submission is accurate both in terms of location and its labeling and does not violate the Map Maker Terms of Service in any manner

Guidelines for Users desirous of moderating User Submissions

ii. The following rules shall govern any action taken by you to moderate any User Submission:
You shall moderate only through your personal knowledge of a local place.

Content that should be denied

Content that you know by your personal local knowledge to be factually incorrect, for e.g. a non-existent building or other landmark, a road going over a building or a water body etc.”

Ouch! So all of these attempts at moderating edits are supposed to be heavily weighted towards local knowledge. To me, that makes sense. After all, the benefits to a crowd sourced system are mainly based on the relevant information that local people can provide about the local situations with which they are familiar.

I guess, then, that the questions for the Google Team are 1) “Do the Map Maker Reviewers really have local knowledge?” 2) “How does Google measure that quality?” 3) “How does Google enforce that requirement?”

To get to those answers, I realized that I needed to know something about trusted reviewers and use the trusted reviewers who evaluated my edits as examples of how the process works. (Please note that the information I found is publicly available and resulted from searching information available from Google and Google Map Maker. Indeed, some of it was provided by the reviewers of my edits. My interest in revealing this information is to point out the weaknesses in Google’s review system and I use this information to make several suggestions for improvement at the end of this blog.)

Starting Point

Lalit Katragadda, one of the inventors of Map Maker, indicated during an interview on Map Maker that “The most difficult part was not the coding, but the structure,” he says. “After all, how do you know which users to trust?”
The article’s author continued noting “As anyone who has asked for directions on the street knows, not everyone can make maps.
So the Google India team invented a software solution that treats each new edit like a separate page. Over time, the machine learns which users are trustworthy. When a user has reliably labeled enough points, he graduates from the system and can moderate other users’ map making too.”

Reliably labeled? Who evaluates this measure? How do they know the edit is correct or incorrect? Gee, that’s a great approach, but what does it have to do with local knowledge?
It would seem, the “trusted reviewers” are those who have successfully edited maps using Map Maker and reviewed edits by other contributors to Map Maker. Apparently, by contributing edits that are approved and by reviewing edits by others you gain “reputation in the system.” I presume that if your edits are always rejected that you get blackballed. If your review of edits by others is reversed on further review, I suppose you get debited and the “cred” that you have in the system is decreased by the ranking system.

While this “cred” system is interesting, it has little to do with measuring the local knowledge of the Google Map Maker Reviewers. In addition, the rating system is not a reasonable method to establish “authority”, although it is a way to establish popularity or it could be regarded as a measure of how few actual map users critically view Google maps. Perhaps it could be interpreted as a measure of those reviewers who evaluate solely on the basis of satellite and Street View imagery?
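To show why an activity-based score says nothing about local knowledge, here is a purely hypothetical sketch of the kind of ledger I am presuming. Google has not published how Map Maker scores its users, so every name, point value and threshold below is an assumption invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    name: str
    cred: int = 0  # hypothetical reputation score


# Hypothetical point values; the real scoring, if any, is not public.
DELTAS = {
    "edit_approved": 1,
    "review_done": 1,
    "review_reversed": -2,
}
TRUST_THRESHOLD = 100  # hypothetical cut-off for "trusted reviewer" status


def record(contributor: Contributor, event: str) -> None:
    """Adjust cred for one event. Note that location never enters the calculation."""
    contributor.cred += DELTAS[event]


def is_trusted(contributor: Contributor) -> bool:
    return contributor.cred >= TRUST_THRESHOLD


# A reviewer who has never visited any of the places reviewed.
reviewer = Contributor("remote_reviewer")
for _ in range(120):
    record(reviewer, "review_done")
record(reviewer, "review_reversed")

print(reviewer.cred, is_trusted(reviewer))
```

In a scheme like this, volume alone carries a contributor past the threshold. Nothing in the inputs records where the contributor lives or whether they have ever seen the feature in question.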

In essence, the major problem with the “trusted reviewer” concept is that the information available to the “trusted reviewer” to evaluate a contributed edit is at best comparable to that available to the contributors of edits and usually less valuable since it is not influenced by local knowledge. Based on my limited examination of Google Map Maker, I have concluded that the trusted reviewers in the Google Map Maker System may have limited or no geographical knowledge of the locations that they edit, or for which they review contributed edits.

The “trusted reviewers” who edited my works were identified only by these names: Shalini, Abhilash, Hemant and Nigar. Edits from two of the reviewers (Shalini and Abhilash) provided links to their history of accomplishments using the Map Maker system. I was able to find details on trusted reviewer Nigar, but nothing for Hemant. What I was able to find, however, provided several useful insights.

For example, “trusted reviewer” Shalini has been editing and reviewing edits in Map Maker for 214 days. During that time, he produced 10,572 edits and 8,599 reviews, including edits of 3,245 features. In essence, Shalini has produced an average of roughly 49 edits and 40 reviews of edits each day since he joined. You can review the edits by “trusted reviewer” Shalini here. You might notice that the Shalini reviews and edits include detailed street level and feature corrections in Vanuatu, Nigeria, Brazil, Guyana, Dominican Republic, Iceland, Montenegro, Puerto Rico, Paraguay, Azerbaijan, Moldova, Macedonia, Romania, Iran, Viet Nam, India, Nepal, Pakistan, Dubai, the United States and many other countries.

Do you think “trusted reviewer” Shalini uses compilation sources outside of those provided to him and all other users by Google? Of course not! In addition, I think we can safely assume that “trusted reviewer” Shalini is not applying local knowledge to the majority of the edits he makes himself or the submitted edits that he reviews. In essence, the skills Shalini appears to have are familiarity with how Map Maker works and the ability to evaluate the suggested edits based on a belief about whether they are supported by the imagery provided by Google (Street View and/or satellite imagery). If this is the case, then, where is the “personal local knowledge” that Google requires of its reviewers? More importantly, where is the “authority” in the Google Map Maker System?

During this research (while fact checking earlier today), I went back to the link to Shalini on the page where he corrected my edit. I was shocked to see that instead of the information provided above (which you can still find by clicking that link as of tonight), it linked to a new page that indicated that Shalini’s stats were now 47 days with 257 edits and 420 reviews. All of the reviews were now exclusively for the United States. How curious. How did this change? Same link location, different days, different results. Ya gotta love the Internet. It’s so authoritative. Well, let’s move to the next reviewer.

According to Google, “trusted reviewer” Abhilash (details here) has been a member for 45 days, with 739 edits and 275 reviews of edits. The Abhilash reviews and edits appear to be scattered across the United States with a minor focus on paths and trails. After reviewing several of the Abhilash edits and reviews of edits, I concluded that the reference material used by this reviewer also appears limited to Google Maps, Street View and the satellite imagery that Google provides to all users. Local knowledge was not in evidence, given the number of states and localities within which Abhilash contributed and reviewed edits.

“Trusted reviewer” Nigar was not linked to on my edit page, but I was able to find him, and in his list of reviews was one of the edits I contributed that he reviewed – so he was the Nigar for whom I was searching. Nigar has, apparently, been contributing for 161 days, including 1,569 edits and 10,183 reviews of edits contributed by others, for an average of approximately 63 reviews per day.
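For readers who want to check the per-day averages quoted for the three reviewers, here is a minimal sketch in Python. The totals are the ones reported above (for Shalini, the first set of figures, before the linked page changed); the `REVIEWERS` table and the `daily_rates` helper are names made up for this illustration and are not part of Map Maker.

```python
# Activity totals as quoted in the post for three "trusted reviewers".
# Each entry: (days active, edits made, reviews of others' edits).
# REVIEWERS and daily_rates are illustrative names, not Map Maker features.
REVIEWERS = {
    "Shalini": (214, 10_572, 8_599),
    "Abhilash": (45, 739, 275),
    "Nigar": (161, 1_569, 10_183),
}


def daily_rates(days: int, edits: int, reviews: int) -> tuple[float, float]:
    """Return (edits per day, reviews per day) averaged over the active days."""
    return edits / days, reviews / days


for name, (days, edits, reviews) in REVIEWERS.items():
    edits_per_day, reviews_per_day = daily_rates(days, edits, reviews)
    print(f"{name:<9} {edits_per_day:5.1f} edits/day  {reviews_per_day:5.1f} reviews/day")
```

These are simple averages over every calendar day of membership, so they say nothing about how the work was actually spread across days.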

It seemed as if it took me forever to page through the dates of his latest reviews of edits when I was trying to establish that he was the Nigar who reviewed one of my edits. I became so interested in his productivity that I decided to analyze his activities on Map Maker for a given day. I selected May 16, 2011 for no other reason than I had the idea when I was looking at some of the reviews that he approved on that date.

Did you know that “trusted reviewer” Nigar reviewed edits contributed by others on May 16th for 12 hours and 29 minutes straight? During that span he reviewed 79 edits, or one every 9.5 minutes. The edits were spread across a number of localities in 21 of the states of the United States.

Numerous edits reviewed for localities in 21 states.

Stunning, isn’t it? But, as they say on television commercials, “there’s more.” Yes, “trusted reviewer” Nigar found time to perform 30 edits of his own in eight states while undertaking his Herculean work in reviewing 79 edits contributed by others. In total, Nigar was involved in reviewing or creating 109 corrections (30 edits, 79 reviews) in 21 states, popping one out roughly every 6.9 minutes over a 12.5-hour period of non-stop map editing on May 16.
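The pace figures for May 16 are easy to verify. The sketch below uses only the numbers quoted above (a span of 12 hours 29 minutes, 79 reviews, 30 edits); the constant names and the `minutes_per_action` helper are invented for this example.

```python
# Figures quoted in the post for "trusted reviewer" Nigar on May 16, 2011.
# The names below are illustrative only.
SPAN_MINUTES = 12 * 60 + 29   # 12 hours 29 minutes of continuous activity
REVIEWS = 79                  # edits by others that he reviewed
OWN_EDITS = 30                # edits he made himself


def minutes_per_action(span_minutes: int, actions: int) -> float:
    """Average number of minutes per action over the span."""
    return span_minutes / actions


review_pace = minutes_per_action(SPAN_MINUTES, REVIEWS)
combined_pace = minutes_per_action(SPAN_MINUTES, REVIEWS + OWN_EDITS)

print(f"One review every {review_pace:.1f} minutes")
print(f"One review or edit every {combined_pace:.1f} minutes")
```

Either way it is computed, that is well under ten minutes per correction, sustained for half a day across 21 states.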

Wow, I wish my employees (that’s me) worked that hard! Other pages in Nigar’s portfolio are also filled with large numbers of edits that occur during one day. For instance, on May 20, he edited and reviewed approximately 70 locations. Hmm. Who does this Nigar work for that he can spend so much time editing Map Maker? Or maybe he just has a lot of spare time. Whatever the case, it is highly improbable that trusted reviewer Nigar has a working knowledge of street level geography in localities scattered across 21 states that would allow him to review the edits of others based on his “personal local knowledge of that place” and the imagery information provided by Google.

Scatter graph of Nigar's reviews and edits throughout the day

One of the comments submitted on yesterday’s blog indicated that the Google editing system was complex because it needed to accommodate newbies and “power users”. I admit that I had not thought of people freely volunteering to be power users for a profit-based company like Google. I realize that there are a number of power users working on OSM, but I had always assumed that these contributors were either: a) dedicated to the notion of an open, world-wide street level database, or b) hoping to create an open database that they could eventually use to support a business or another opportunity that might make moolah.

I suppose volunteering all your spare time to improve Google Maps is a possible explanation for the work of trusted reviewer Nigar and if so, Google should at least send him the bedding shown below, which can be purchased here (thanks to Duane Marble for pointing these out to me).

Dreamland for highly incentivized map reviewers.

Preliminary Conclusions

Having taken a look at the contributions of three of the four “trusted reviewers” who reviewed my edits, I think I can provide some preliminary answers to the questions I asked earlier in this blog.

1) “Do the Map Maker Reviewers really have local knowledge?” Although my experience with the review system is limited, I suspect that many of the reviewers of Map Maker edits in the United States do not have the requisite familiarity with local places to review edits “…only through your personal knowledge of a local place,” as required by Google. This lack could change over time as more locals become involved in the process, which was opened to them in the U.S. approximately one month ago. However, Google Map Maker is off to a poor start, at least if my experience is any measure of the process.

2) “How does Google measure that quality (local knowledge)?” I saw no evidence that Google directly measures or tries to measure the local knowledge of its reviewers. It may regard its rating system based on how many reviews of edits a person makes that are accepted or overturned as a surrogate measure of local knowledge, but if so, this is confounding the issues involved in the review process. Further, if few reviewers really have the local knowledge to review edits on the basis of familiarity with local geography, then how authoritative can this system be?

3) “How does Google enforce the local knowledge requirement?” Sorry to say this, but on the basis of my limited analysis, they don’t. I have a suspicion that they are turning a blind eye to the issue in an attempt to build critical mass, but I have no substantive evidence to support that thought, at least not yet.

Summing Up

Based on the admittedly modest amount of research I conducted, I have concluded that the “trusted reviewers” Shalini, Abhilash, and Nigar operate within the research boundaries of the information normally provided in Google Maps and mirrored in Map Maker. It does not look like they could possibly bring local knowledge to the majority of their edits or their reviews of edits. Conversely, Google’s own guidelines require that “your personal local knowledge” be used in making and reviewing edits. If this is true, how can these reviewers who evaluate contributed edits submitted from a wide variety of geographic locations and other reviewers who presumably operate in the same manner qualify as “trusted reviewers”? Just to be sure you understand me, I am not suggesting that these reviewers are being dishonest, just that they likely do not have the required local knowledge to critically evaluate the edits they review.

If it took only a little digging on my part to unearth this problem, can Google be unaware of it? Hmmm. Wouldn’t it be something if some of these “trusted reviewers” were Google employees, or were compensated by Google, or were not even located in the United States, or had never been in the United States? You know, just to get the old crowdsourcing ball going in the U.S. and moving it along to generate that critical buzz that will provide the growth required to make crowdsourcing input large enough to get the edit ball rolling? Now I get to use a phrase I hate, but can’t resist using here – it is “I’m just sayin’…”

However, we need to remember that crowdsourced systems are considered self-healing over time. Edits are pushed out so that other people can see them and correct them, if necessary. The crucial issue is whether there are enough edits to provoke reactions/corrections in the population that uses the map base. More use and correction is thought to improve the quality of the data. Whether this self-healing actually occurs in a crowdsourced system remains unclear at this time and should be a fertile area for those interested in researching crowdsourcing and map compilation.

The use of crowdsourced data to solve the types of editing problems common in map compilation is an approach that might not work out well in the U.S., at least not based on Google’s current approach. It is my sense that Google does not want to create a field organization similar to that employed by Navteq, nor does it want to manage an operation that is not driven by algorithms. In other words, using a volunteer, self-organizing workforce is a big gamble for Google and it is one that has a reasonable probability of failing if the volunteers are left to their own devices.

Recommendations for Improving Map Maker

So, what can Google do? I think you know better than I. However, here are some suggestions (including several contributed by my colleague Steve Guptill).

1. Add improved structure/taxonomy to the features and objects that Google is willing to crowdsource.

2. Provide alternative paths/solutions/tools when current methods do not allow the user to transfer the type of edit information they had hoped to contribute.

3. Smarten up the edit system. There are too many complex objects that cannot be edited or corrected and it is these data that are major flaws in the Google Map Base.

4. Make metadata available for the satellite and Street View imagery, but only if you really want it to be the crucial factor in edit decisions.

5. Track user IP and use this as part of the process to evaluate whether an edit contributor or a reviewer of edits might have relevant local knowledge.

6. Map Maker is buggy. Fix it.

7. Improve the boundary data files describing the location of edits. Users need to have confidence that you know what you are doing and Map Maker does not yet provide that confidence.

8. The decision process in reviews needs to be more transparent and helpful. A colleague had an edit rejected when he attempted to inform Google through Map Maker that a local chain of banks had changed its name. Google’s reviewer said “Nope.” My contact provided links to sites indicating the name had changed. No reason for the rejection was provided, nor any alternative solution. Good luck with that kind of thank you.

9. For alternative approaches, talk to people who have experience with map compilation relevant to the type of map database you need to support your other strategic initiatives.

10. Spend more time focused on human factors and interface design relevant to map compilation systems.
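To make suggestion 5 a little more concrete, here is a minimal sketch in Python of one way a location signal could feed into that evaluation. Everything in it is an assumption for illustration: the function names, the 50 km threshold, and the rounded coordinates are hypothetical, and the step of turning an IP address into an approximate latitude and longitude (which is coarse at best) is not shown. It simply measures the great-circle distance between where a contributor appears to be and where the edit is.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def might_have_local_knowledge(
    contributor: tuple[float, float],
    edit_location: tuple[float, float],
    max_km: float = 50.0,  # hypothetical threshold, chosen only for this example
) -> bool:
    """Crude proximity test: is the contributor's apparent location near the edit?"""
    return haversine_km(*contributor, *edit_location) <= max_km


# Rounded coordinates, for illustration only.
MISSION_VIEJO = (33.60, -117.67)
LAGUNA_HILLS = (33.61, -117.71)
NEW_YORK = (40.71, -74.01)

print(might_have_local_knowledge(LAGUNA_HILLS, MISSION_VIEJO))
print(might_have_local_knowledge(NEW_YORK, MISSION_VIEJO))
```

Proximity alone proves nothing about what a person knows, so a check like this would be at most one input among several, which is how the suggestion is worded.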

There are a host of other things that occurred to me, but this blog is already too long (especially when added to yesterday’s opus), so I stopped the list.

Insights

I offer two final insights.

1. Google Map Maker was developed to assist in the creation of maps in countries where maps were either lacking or so expensive and copyright protected, that they were unavailable to the ordinary user. Whether the specific crowdsourced system of map compilation created by Google in the form of Map Maker, which is based on the scarcity of public maps, will provide an advantage in the United States, a country with a rich heritage of publicly available free maps, low cost maps and free and low cost map data, remains unclear to me.

In countries that lack a viable, public map infrastructure, it is likely that crowdsourcing is a viable method of creating detailed, street level databases. In these cases, the majority of the important contributed knowledge (street names, alignments, route numbers, addresses, points of interests, directionality, etc.) would be provided by local users, since there is no national source that can freely be used as a reference.

But will this model work in countries where national and local maps from authoritative sources are available online from a variety of sources? In these situations, will people be incentivized to contribute local knowledge that could improve the map or choose to serve as remote librarians, providing recommendations based on what they can observe online at other map sites without reference to or observation of local circumstances? If so, this becomes a game potentially unattached to local knowledge and one that will not significantly benefit the quality and accuracy of Google Maps.

2. I was amused to find that those editing data in the Map Maker system can do so because they are “editing spam data”. Ain’t life grand in the world of crowdsourcing? How do you recognize spam data? I mean, I’d know it if I ate it, but I am not sure I could recognize it if I saw it. What’s the difference between spam data and a mistake? I suppose it’s intent, but what tool do you use to measure that?

Google Map Maker - Home of the famous spam edits.

I had intended to compare Map Maker and Google’s efforts at map compilation with those of Navteq as part of this series and will do so next time I put electrons to plasma. But before that, I am going to do something else. I got so interested in the last two blogs that I spent too much time on them and not enough on my consulting practice or on my other interests – like maybe playing that new 12 string guitar my wife Bonnie gifted me with for my birthday last week.

On June 16, 2011, I will be speaking at the 2011 New York GeoSpatial Summit. It looks to be a very interesting day and I hope to see some of you there.

Posted in Authority and mapping, Data Sources, Google, Google Map Maker, Google maps, Mike Dobson, Navteq, User Generated Content, crowdsourced map data, map compilation, map updating, openstreetmap | 7 Comments »

Google Map Maker’s Edit and Authority System – Part 1

May 23rd, 2011 by MDob

This and the blog I will publish tomorrow are part of a series examining Google Map Maker and how its edit and authority systems function. I apologize for the delay in posting this work, but I decided to wait for the reviews of my edits, as the reviewing process is the most critical component of the Map Maker system. After I finished writing my analysis, I realized that the document was too long and have chosen to publish it in two installments, one today and the next tomorrow.

Caveats

1. I am not an investigative reporter and do not look for trouble on purpose. However, I find that when you look, poke and prod, you often find things that you did not expect. The length of this blog is a prime example. I had not expected the topic to require more than one essay. I was wrong.

2. My purpose in contributing edits to Map Maker was focused on finding out how they would evaluate these edits and accept or reject them. As a consequence, tomorrow’s installment provides a look at the authority system behind Map Maker, while today’s installment is a brief review of what I encountered when I tried to use Map Maker to edit the Google Map Base.

3. I apologize in advance for generalizing about some of my conclusions. However, when you read the information described in this series, I think you will agree that the conclusions are reasonable and, perhaps, something that someone should look at in more depth.

4. Google, you put this stuff out there, so blame yourself.

5. By way of background, the Google Map Maker software was invented at Google India, in Bangalore, by Alit Karaganda and Dimple Bart and launched in India in 2006. An article at livemint.com provides details of the history.

6. I looked at the Google Map Maker videos on You Tube, read the pages for beginners and the available help files. None of these were very detailed, but it looked like that was as much as I was going to find. There may be other materials that help the newbie figure out how to work with Map Maker and I apologize if I missed these documents. But what the heck, let’s see how Map Maker works.

The entrance link to use Map Maker to edit Google Maps

I went to Google Maps, zoomed into Mission Viejo, California, clicked on the “Report a problem” tab at the bottom of the page, and then clicked on the “Now you can edit the map yourself, on Google Map Maker” link. After having tried to use Map Maker to correct Google Maps, I think the ‘Map Maker link’ should be reworded and phrased as a disclaimer. Perhaps something like this would work.

“You (as in you yourself) might or might not be able to edit this map using Google Map Maker. However, in either case, you should proceed only if you have a high tolerance for goofy editing systems and you are willing to ignore the numerous mistakes in our map database that we will not, under any circumstances, let you correct. Further, our trusted reviewer of your work might or might not be personally familiar with your local geographic area and might not find your edit acceptable, even though it reflects reality.

Further, in order to use this system, you must sign in to your Google account to establish your identity, even though we might regard your edits as ‘spam edits’ and disregard them. This cute little trick helps us to increase the number of our Gmail accounts and allows us to read your Gmail email to better target local ads for goods and services that you may or may not be able to find on our maps, regardless of your intent to help us improve the representation of local geography in Google Maps.”

However, since there was no disclaimer to be found, I decided to attempt to use Google Map Maker to enter several map corrections I had researched. Unfortunately, I found the system so lacking and difficult to work with that I gave up after trying to deal only with three simple edits. I decided that attempting the complex edits I had planned was asking too much of a volunteer.

I had expected some difficulties, but Map Maker just isn’t that hunk of burning funk for which I was hoping. Let me be blunt: based on my experience, Google Map Maker is a system put together by software engineers who apparently do not understand best practices in map compilation. Nor do they appear to have spent any measurable time studying the Human Factors aspects of computer-user interfaces. On the other hand, I guess if you had never used anything else to edit maps, Google Map Maker might seem like a godsend. In a similar vein, you could tell a person who had never seen bacon before that it was Zebra meat, and they might accept your description. But perhaps I have put the cart before the horse. Let’s look at the simple edits I tried using the Map Maker edit tools.

First, I had noticed that some streets in my development had small cul-de-sacs that were unnamed on Google Maps and thought that I would help them out by supplying the names. To make sure that they were not named, I zoomed in as far as possible and, sure enough, names did not appear on these stub-streets. However, when I put the icon of the “Street View man” on these stubs, I soon found out that Google Maps did know the street names. In addition, sometimes when I right-clicked the stubs and queried ‘what’s here’, the service would provide me the name of the stub.

I find this lack of name display curious. At several of Google Maps’ higher zoom levels there would be no difficulty rendering these street names on the display, nor any visual design reason not to do so. I guess Google must regard them as undesirable map clutter.

So, Google, put this in your Map Maker playbook – if you want people to edit your maps, show them what is on your maps. Don’t make them hunt for data to correct, as they have better things to do with their time. (While I am on this brief diversion, the mapping functionality on Navteq’s website did not show the stub names for the same streets, leading to this question, “Why are the display properties set for these two map databases identical?” Could it be…? Oh well, that’s another topic.)

Edit 1

Scanning further around the neighborhoods with which I am familiar, I found a blunder that I thought would be easy to fix. While representing the parking lot associated with a local medical office complex, Google had mistakenly shown one of the lanes in the parking lot as connecting to a local thoroughfare. Below are two photos taken the day of the edit, showing the end of the lane in the parking area from the west (the parking lot side) and the east (from the street-side). It is relatively easy to see from the photographic evidence that the parking lot lane does not connect with the street (Marguerite Parkway).

View of the parking lot aisle from the street

View from the street side of the parking lot

The Google map of the parking lot is shown below and the lane in the parking lot is shown as intersecting with Marguerite, rather than ending approximately twenty feet to the west, as it does in reality.

Parking lot aisle as shown in Google Maps connecting to Marguerite

Depending on whether or not the designers of the Map Maker system contemplated allowing users to interact with connectivity and topology, I realized that I might need to edit the intersection with Marguerite, as well as the line segment itself. So, I started with the intersection, but that did not work out well.

Intersection nodes as shown on Google Maps

After grabbing the intersection, I noticed the error message below and could find no way to recover the situation or to save my work. Oh well.

The previous action resulted in an unrecoverable error.

The error message reads “Incorrect address. Internal Error: Bad Data (Bad Feature). Please report this error with a link to this page.” I had no idea to whom I was supposed to report the error while including a link to the page. I admit, I was shocked. Has Google not heard of error trapping? Does the system not store the what, where and the specific page where I was editing? Curious.

I thought that, perhaps, the next best path to take would be to edit the street segment, rather than the inappropriate intersection. However, when I attempted to do so, the edit page revealed that I was editing Multiple Road Sections and warned that “The geometry of this feature cannot be modified.” Hmm. Well, then, what exactly was I supposed to edit to remedy the situation? I looked for alternatives, but there appeared to be none that were appropriate. The only option that made much sense was to leave both a description of the error and a separate comment indicating that the segment was incorrectly represented and that it did not intersect with Marguerite, noting that the lane terminated at the edge of the parking area. I did so and saved the information for Google’s review.

I have to admit that the intersection error continued to nag at me, so I went back for a second attempt at editing the intersection. However, in the Google Map Maker system, it appears that if someone is editing any line associated with the intersection, you cannot edit the intersection that is attached to it until a moderator decides whether or not the edit suggested for the line segment will be accepted. Locking features can be a good idea in interactive systems, but not in one in which the moderation necessary to release the segment takes over a week, as happened in this case. Well, I have no one to blame but myself, but it is hard to imagine that leaving a comment and not changing any element would lock an edit system.

I guess this is an example of Map Maker’s highly inefficient, volunteer, moderator-gated approach to map editing. After all, I and other contributors have nothing better to do than to come back at some later point in time to make another attempt at correcting the issue that I could have corrected when I was at the website and which they could have queued up for future resolution. However, that would take more “programming smarts” than Google apparently decided to put in Map Maker. Or maybe this just reflects that the edit system is so heavily moderator driven that corrections depend, at least theoretically, more on the local knowledge of the moderator than on the topological data available to the system. Unfortunately, the speed of the process also seems gated by the availability of moderators who can evaluate the validity of the contributed edits.

A mere seven days later, the edit for this feature was accepted by “trusted reviewer” Hemant. Unfortunately, the road geometry remained unchanged on Google Maps and the parking lot lane still intersected with Marguerite. In response, I wanted to tell them that the critical issue had not been resolved by editing, but apparently could do so only by editing my previous action, which you might remember was a comment. However, once again, it appeared that all I could do in this situation was to add another comment, which, of course, locked me out, once again, from editing the intersection.

However, two days later, a new “trusted reviewer” identified as Nigar must have read my second comment and responded “Hi, as per the input provided by you and the street view I get to see that this segment of road does not intersect with the ‘Marguerite Pkwy’. I am approving the edit and making the necessary change. Thanks.” Unfortunately, the intersection has still not been corrected on Google four days after the receipt of the approval of the edit.

Edit 2

I had better luck suggesting to Google where the actual access to the parking lot in the medical office complex was located.

Actual access to medical complex from Marguerite

When I reviewed my edit a few days later, I was shocked to learn that I had picked up fluency in a new language that also includes unique symbolic phrases. How could a reviewer make sense of this gibberish?

Hmm. What language is this?

However, the edit was accepted nine days later, by “trusted reviewer” Shalini, who provided this note “Hi, thanks for the edit. The ‘Segment usage’ can be left as ‘None’ and there is an intersection error. The road need to be connected to near by road. I will do the edits for you and approve.”

Yes, all true. I tried to grab the end of road to extend it to intersect Marguerite, but simply submitted it as it was after several tries where it appeared that my mouse’s on-screen pointer must have been coated with grease. Curiously, I have heard from others that they seem to be having the same problem with Map Maker when attempting to extend lines by grabbing a node and moving it.

Edit 3

Next, I tried to add an underpass/tunnel that crosses under a local, divided street. The tunnel under Oso Parkway is actually a horse crossing tunnel, but one that is used by pedestrians who like to walk along the horse paths in this area. In order to accomplish this task I added a line to show the beginning and ending of the tunnel, but soon found out that there was no attribute for a “horse underpass” (at least not that I could find). I decided that I would categorize it as a Pedestrian/Bicycle underpass and correct the notion with a comment.

However, when I later reviewed my comment, I found that I was no longer writing Mandarin, but I had now somehow decided to write in a form of HTML. How was a reviewer supposed to understand this gobbledygook?

Location where a tunnel serves as a horse crossing underneath the adjacent road

This HTML-speak scrolled so far beyond my endurance that I soon lost interest in whatever it may have been attempting to tell me.

You may have noted from the image that the underpass is located in Galivan, Ladera Ranch, notations which were provided by Google, followed by the names of locations that I added during the edit process.

Galivan is designated by the U.S. Census as a Populated (Community) Place or U6, which is defined by the Census as “A populated place that is not a census designated or incorporated place having an official federally recognized name.” Google, following Navteq’s lead (hmm – or maybe that’s everyone following GDT’s or, perhaps, ETAK’s lead) shows Galivan on their maps.

I’ve got news for the Census, and all the mapmakers. Galivan is not a populated place. Even the commercial Storage Center that is within fifty feet of the center of Galivan is not in Galivan. As a matter of fact, Google shows Galivan right between a pair of railroad tracks. Of course, since this is railroad right-of-way, there are no buildings here that could be construed as Galivan.

The storage center is not located in Galivan, but then, nothing is located in Galivan.

There is no Galivan. It does not have a federally recognized name, because it appears there is nothing to recognize – no people, no home, no boundaries, nada. Of course, everyone can be forgiven for this since Galivan is listed in the Geographic Names Information System (GNIS) maintained by the USGS. However, if you look closely, you will find that this record was first picked up in 1981 from a 1:24000 Topo Map and the Decision Card on the feature reads “No Data Found”. Oh those cartographers. What a sense of humor. In addition, what a cruel trick to play on those conflation routines! Of course, this is exactly the type of situation that could benefit from local knowledge, if only Map Maker effectively solicited local knowledge.

Ladera Ranch, the other entity mentioned by Google as a related location in the tunnel image, is an unincorporated, planned-community (no official boundaries), which, unfortunately for Google, is located in the eastern portion of Mission Viejo, which is east of Laguna Niguel, which is located east of the mysterious Galivan, which is located east of Laguna Hills, where the underpass occurs within the borders of a Laguna Hills community known as Nellie Gail.

However, my edit was accepted after 9 days by the “trusted reviewer” Abhilash. Well, the tunnel was accepted, but not my information about the horse crossing or Galivan. Yep, the accepted feature is a pedestrian crossing located with this string of modifiers: “Galivan, Ladera Ranch, Laguna Hills, Orange, California, United States 92653.” It seems that my input on localities made the situation even worse.

Next Steps

As noted at the start of this blog, I gave up my editing career after these three attempts. I admit that I do not have a high tolerance for the kind of goofiness that I ran into with Map Maker and others may have more success than I. Another reason that I gave up was that the feedback was taking so long to get back to me that I lost my enthusiasm for the process.

However, when I got the feedback and saw the reviewer comments, I started thinking about the feedback and the data I had provided in the edits and began wondering, “How did they decide to approve my edits?”

The tunnel edit was particularly interesting since the feature cannot be seen in Google’s satellite imagery, nor can it be seen in the Street View imagery, raising the question as to what evidence “trusted reviewer” Abhilash relied on, other than my say so. Since this was only my third edit, I had presumed that I had no “cred” in the system. How, then, did they decide? I’ll publish a critical analysis of the Map Maker Authority System tomorrow. Trust me, you won’t want to miss it. It’s a corker.

Posted in Authority and mapping, Google, Google Map Maker, Google maps, Mike Dobson, Navteq, User Generated Content, crowdsourced map data, map compilation, map updating | 3 Comments »

Google Map Maker Goes Crowdsourced in the United States on “Judgment Day”

April 27th, 2011 by admin

Last Tuesday was a scary day for me. After listening to a Discovery Channel show about how the Pacific Coast of the United States was the next in line for a catastrophic earthquake, I decided to pay my California Earthquake Insurance, which was billed at $666. I hoped this was not an omen. Later in the day, I was on the road driving to Santa Clara, California to attend the Where2.0 Conference. As many of you know, last Tuesday, April 19, 2011, was “Judgment Day”, the day that Skynet, the AI-based, U.S. Military Defense System, went live (at least according to the television series based on the Terminator movies, which is perhaps more authoritative than the Discovery Channel with some of you).

While driving, I was waiting for my XM Satellite Radio to announce that the Where2.0 meeting had been declared illegal by Skynet, followed by a message that my GPS signal was now being managed by Skynet and that I needed to turn around and go home, since knowing where I was going was no longer my concern. Thankfully, Skynet did not go live. Alternatively, Tuesday, April 19, 2011 was the day that Google decided to announce it was opening up its U.S. Mapbase to crowdsourcing through Google Map Maker – perhaps not as shocking as a Skynet stand-up, but maybe more important.

And the Crowd Went Wild

Apparently Google’s announcement was cause for celebration among those who would now be allowed to donate their time and industrious endeavor correcting and augmenting maps for Google while allowing Google to own the copyright in and to the contributed geographic information, as well as to sell the rights to use it through the company’s new Google Earth Builder product. Pretty cool, huh? Or is this the reason that many people contribute to OpenStreetMap and ignore Google’s efforts?

Capitalism aside, the nagging question is “Why has it taken Google so long to add active crowdsourcing as a tool in its map compilation efforts?” (Note – for those of you who do not read this blog regularly, I regard “active” crowdsourcing as a situation where the person contributing map data has to take an active role in the process, such as using his or her computer to enter data they have endeavored to collect. “Passive” crowdsourcing involves the use of probes, such as Personal Navigation Devices that record the user’s path as they drive and require no specific, dedicated effort on the part of the user.)

It was quite clear that Google was not going to be able to pull away or even reach parity with Navteq in terms of map database quality using its standard approach to data fusion. Those of you who have followed my blogs on this topic will remember that in the summer of 2010 I addressed the value of local knowledge in map updating in a multi-part series titled “Better Maps Through Local Thinking” (the concluding article is here – it has some good stuff in it if you have not read it before). You may also remember that in January of 2010 I wrote another series on Google and map updating and Part II of the series has some useful illustrations supporting my contention that Google needed to turn to crowdsourcing to improve its data fusion process. Another blog from January 2010 titled “More on Google’s User Generated Content Tower of Power” predicted that Google would eventually need to find a way to provide meaningful incentives to prompt its map data gatherers to continue to provide “free” updates for the company’s spatial databases.

You know, Google could make this a lot easier on me by talking to me about its plans and asking for advice, but since that seems unlikely, I am going to tell you the changes that I think you will see Google undertake to improve the quality of their map database in the United States now that they have opened it to crowdsourcing – but first a little “color” commentary on my point of view. (Of course, if Apple wants to ask me for some advice on strategy, I would be willing to help out (I guess Google’s form of capitalism is infectious).)

Perspectives

1. Regular readers of this blog know that I am a fan of crowdsourced spatial data. However, my position on crowdsourcing is that it is just another of the numerous tools available for collecting spatial data. Similar to other tools, crowdsourcing has benefits and weaknesses. Whether Google can harness the power of crowdsourcing in a manner that improves the quality of the data in its U.S. database will depend on how the company implements and evolves the crowdsourcing platform (currently known as Google Map Maker).

2. I may be the only one alive who thinks the Google announcement was interesting because it marked a capitulation by Google in the map compilation wars. Yep, relying only on data fusion was not going to get Google to the accuracy level required for success in the navigation and advertising markets. The error correction process implemented by Google to remedy inadequacies in their map database was so inconsistent that users were losing faith in the system’s ability to recognize and retain corrected spatial information. When Google was forced to abandon their algorithms and manually change their data due to user outrage over a particularly inept gaffe, the company would eventually revert back to an algorithmically generated, but inaccurate depiction of the situation based on the fact that they had ingested a source that they felt was more authoritative than the data provided by the users on the ground who had asked them to correct the original error.

The most recent case of this that I know of, documented by Mike Blumenthal in his blog, involves the trials and tribulations of the towns known as the Yorks of Maine in their effort to remain on Google’s map of the United States. Another situation was sent to me by Duane Marble. The article he forwarded noted that Google had bowed to public pressure from the government of Brazil and agreed to remap its depiction of Rio, as it seems to have represented the city as being one large shanty town with few other attractions or neighborhoods. With the decision to accommodate crowdsourcing as a compilation tool, it would seem that Google has finally come to understand that map editing is something that cannot yet be solved by algorithm – or at least not using the approach they have adopted.

As a consequence of dissatisfaction with its past attempts at a data fusion approach to map compilation, Google has now decided to open the doors to crowdsourcing and hopes that somehow spatial accuracy will result from listening to millions of opinions about the current state of geography on the ground. How many opinions? Well, we really don’t know, but at the recent Where2.0 Conference, Marissa Mayer, the Google VP for Local, showed a slide indicating that Googlers around the world spend one million hours browsing geo-content each and every day. The only other data point that I have was the one mentioned in this blog in February 2010 in which Google indicated that every hour of each day it receives over 10,000 corrections or additions to Google Maps. I guess we can conclude, without much difficulty, that Google should have a significant user base willing to provide corrections, updates and augmentations of its U.S. map base.

However, numbers alone do not tell the tale on the utility of crowdsourced systems. What are some of the concerns for Google in its use of crowdsourced data for map compilation?

1. While crowdsourced data is thought to add “local” knowledge to street level mapping, there is no systemic limitation on people contributing change information for areas far from their home location. For example, many of the contributors to OSM in the UK focus their contributions on digitizing streets from satellite imagery across the country, including areas with which they lack familiarity that are distant from their homes and for which they do not know any of the attributes that apply to the streets and roads that they may digitize in these areas. In effect, it is unclear whether or not the “local” information that is contributed by crowdsourcing is contributed by local people who actually might know what is happening on the ground in a given location.

2. Where does all this crowdsourced knowledge actually come from? I do not doubt for a second that some of it could be gathered by field observation. But is it? Yes, but how much? Years of scientific research suggest that most people have a relatively poor memory for spatial location and are even worse at remembering the specific positions and attributes of objects within these locations. I suspect that some crowdsourcers will pull out their Rand McNally Road Atlas or Thomas Brothers Guide to find the exact names of those streets and items they cannot remember. Oh my, did I imply copyright infringement? Maybe, maybe not, as it remains an open question, at least in the United States, as to whether a compilation of facts such as a map database deserves any copyright protection.

3. Some geographic data aren’t visible and will be difficult for the casual crowdsourcer to collect. For example, legal borders that are marked on maps in all their glory are often delineated on the ground only by the occasional sign along a street that indicates you are entering or leaving a legal jurisdiction. While the U.S. Census provides details on these boundaries through their yearly Boundary and Annexation Survey, sometimes these data are not publicly published for several years. Other “invisibles” include bus routes, park boundaries, property boundaries, and rights of way, etc. Perhaps Google intends to provide these data from its fusion process, but that won’t work out too well for them. So, just where will the crowdsourcers get these data – see the discussion immediately above.

4. Haklay’s research into OSM’s mapping efforts in the UK indicates that there may be a socio-economic bias in the OSM crowdsourced data for the UK towards providing comprehensive coverage in more affluent places. Not everyone has access to broadband or a computer, or has the spare time to sit and digitize Google’s imagery all day, or, perhaps, the spare time and know-how to report a local map error that has caught their eye. By extension, it is likely that areas of U.S. cities that have high crime rates and urban blight are largely unfamiliar to the sample population who might be willing to contribute crowdsourced data to Google. In essence, it could be difficult to manage crowdsourcing in a manner that produces comprehensive map coverage across large geographical areas. See below.

5. Next, the efficacy of crowdsourcing is some function of the geographic distribution of people willing to contribute data to the effort. The vast majority of the mileage in paved streets and roads in the United States is in rural areas, where low numbers of potential contributors may limit the comprehensiveness of the coverage that could be provided through the use of crowdsourced data. However, if Google is interested in having reasonably accurate maps only in “urban” areas, crowdsourcing could serve them well. (Indeed, I understand that a company in the Seattle area (not Microsoft) is using passive-probe data (similar to that used to collect traffic data) to build a street database of the twenty largest cities in the United States.)

6. It is also important to note that there is not a focused, organizing force urging Google’s crowdsourcing contributors to complete coverage in location X by date Y. Instead, the schedule of coverage is rather like Topsy: it continues to grow at its own pace, where and when contributors feel interested in adding, correcting or augmenting data. Time frames and formal correction cycles are not a part of the crowdsourced world. Having a crucial error in your data that needs to be fixed may not be of interest to the contributors of crowdsourced data that you attract – at least not without incentives of some sort.

7. Recent significant research by Girres and Touya on the quality of the French OpenStreetMap Dataset raised questions on the heterogeneity of the crowdsourcing process, the scale of production and the compliance of contributors with standardized and accepted specifications. The authors concluded that OSM has great promise, but that its data was of variable quality and would remain so until the tension between “openness” and standardization of requirements was restructured with specific requirements for data entry and attribution. It is my belief that the structure of the contributed information will plague Google, unless it imposes more rigorous constraints than Map Maker has today.

8. Google representatives, as discussed in an earlier blog, have made this statement: “Carefully considering Google’s mission, guidance from authoritative references, local laws and local market expectations, we strive to provide tools that help our users explore and learn about their world, and to the extent allowed by local law, includes all points of view where there are conflicting claims.” Let’s see, “Google’s mission is to organize the world’s information and make it universally accessible and useful.” How does that part about local market expectations jibe with the mission statement? Is Google’s map comprehensiveness governed by organizing the world’s information conditioned by local market expectations? Will Google’s vetting process pay as much attention to a map change in Truth or Consequences, New Mexico as it will to one in Dallas, Texas? Google seems to have avoided proclaiming itself as agnostic in respect to map changes. Reference publishers really cannot afford to do that – they need to stand for something, but in Google’s case it is not exactly clear what that might be and how it will influence their evaluation of crowdsourced map data.

9. Due to their “always editable” status, crowdsourced spatial databases are constantly changing and, as a consequence, their error signatures are considered (at least theoretically) to be self-healing over time. The reason that most crowdsourced systems update in near-real-time is to provide other contributors the opportunity to correct erroneous representations as soon as possible after these data have entered the system.

Whether the healing of crowdsourced map databases actually takes place in a uniform and helpful manner is a complex issue that involves interactions between the number of participants contributing spatial data, their intentions and motivations, their interest in contributing data over long periods of time, and the spatial distribution of these contributors required for comprehensive map coverage. In essence, it is an open question whether or not the spatial data quality of crowdsourced mapping efforts can be managed to meet specific requirements on a timely and reliable basis. Curating crowdsourced data can be especially vexing and this is likely the key problem that Google faces going forward with crowdsourced data in the United States.

So what, if anything, is Google going to do about these problems?

Google representatives have indicated that the company wants to use crowdsourcing as a method for harvesting map corrections, as well as for collecting data elements that you usually do not find on maps, but that people use. Their plan is to harvest local knowledge to realize these goals, and the keys to making crowdsourcing work for Google can be found in the edit and authority systems they intend to use. However, it is my opinion that Google does not yet have its approach set up right and will need to change it over time to gain the benefits they desire. Next time, I will write about the Google edit and authority system, compare their efforts to those of Navteq and prognosticate who should have the better data and coverage, along with a discussion of the wild cards that will make this competition quite interesting.

References

1. Haklay, Mordechai (Muki) and Clare Ellul. (forthcoming). Completeness in volunteered geographical information – the evolution of OpenStreetMap coverage in England (2008-2009). Journal of Spatial Information Science

2. Girres, Jean Francois and Guillaume Touya. 2010. Quality Assessment of the French OpenStreetMap Dataset. Transactions in GIS 12(4), pp. 435-459

Posted in Apple, Authority and mapping, Google, Google Map Maker, Google maps, Mike Dobson, Navteq, crowdsourced map data, map compilation, map updating, routing and navigation | No Comments »

Various Jottings and News You Can Use

April 14th, 2011 by admin

Just some wrap-up stuff today, but still things that you might find interesting.

Along with two of my colleagues, I spent a great deal of last year wrestling with how we might advise the Geography Division of the Bureau of the Census on the development of its proposed Geographic Support System. Leslie Godwin of the Geography Division managed the projects and we would have been lost without her able assistance.

The reports we created are now public and available from the Census at this address.

The Census has affixed a note to each of the reports and I have posted it here so that there is no misunderstanding about the nature of the effort and the fact that the reports express the views of the authors and not the views of the Department of Commerce or the Bureau of the Census.

“ In the Fall of 2010, the Bureau of the Census, Geography Division contracted with independent subject matter experts David Cowen, Ph.D., Michael Dobson, Ph.D., and Stephen Guptill, Ph.D. to research five topics relevant to planning for its proposed Geographic Support System (GSS) Initiative; an integrated program of improved address coverage, continual spatial feature updates, and enhanced quality assessment and measurement. One report frequently references others in an effort to avoid duplication. Taken together, the reports provide a more complete body of knowledge. The five reports are:
1. Reporting on the Use of Handheld Computers and the Display/Capture of Geospatial Data
2. Measuring Data Quality
3. Reporting the State and Anticipated Future Directions of Addresses and Addressing
4. Identifying the Current State and Anticipated Future Direction of Potentially Useful Developing Technologies
5. Researching Address and Spatial Data Digital Exchange and Data Integration

The reports cite information provided by Geography Division staff at “The GSS Initiative Offsite, January 19-21, 2010.” The GSS Initiative Offsite was attended by senior Geography Division staff (Division Chief, Assistant Division Chiefs, & Branch Chiefs) to prepare for the GSS Initiative through sharing information on current procedures, discussing Initiative goals, and identifying Initiative priority areas. Materials from the Offsite remain unpublished and are not available for dissemination.

The views expressed in these reports are the personal views of the authors and do not reflect the views of the Department of Commerce or the Bureau of the Census.”

You may remember that I wrote a blog about the Mechanical Turk and HITS. Sometime after that I received an email from Mike Blumenthal about the difficulties that Yelp had run into dealing with the Mechanical Turk. The Yelp blog and the articles referenced make for good reading on this topic.

However, before you shut the door on the concept of HITS, take a look at this study whose investigator has been able to harness the concept and make the work fun.

Next, I was asked by Kevin Dennehy of the LBS Insider (published by GPS World) about the potential impact of AT&T’s acquisition of T-Mobile on LBS, Apple’s efforts in mapping, and my thoughts about the CTIA Conference. You can find my responses here and there are some interesting statistics in the AT&T section of the interview.

Last week I drove to Redlands, CA and had lunch with Jack Dangermond, CEO and founder of Esri. This was a social event and not consulting or sourcing for this blog, but it was fun getting Jack’s insights on the industry and the trends that have emerged over the last year or two. I have been making the drive to Redlands for lunch with Jack every few years and am convinced that the Esri Campus is now bigger than the town of Redlands. The new Esri Conference Center is stunning – you know, there must be real money in GIS.

While at Esri, I was able to spend a few minutes with Don Cooke (formerly with TomTom Tele Atlas and founder of GDT before that) and managed to catch up on Esri’s interesting leap into the world of community maps. I’m still rolling most of the ideas around in my head, trying to figure out the advantages that accrue, but I am not quite there yet. More soon.

Next week I am at Where2.0 to take the pulse of that segment of the market. Should be interesting and I hope that I see you there. I posted a watercolor of me in the attendees section. I will be one of three, white-haired, “old” guys roaming the conference.

Posted in Apple, Authority and mapping, Geospatial, Mapping, User Generated Content, Volunteered Geographic Information | No Comments »

TomTom – Tele Atlas Not an Acquisition Target?

March 24th, 2011 by MDob

I’ve finally revived from my long winter nap, hoping that something interesting would have happened in the world of location while I was “sleeping”. I keep looking through my news sources on acquisitions, tools, initiatives, products and research related to spatial data in one form or another, but it appears that a miasma of lethargy has set in, holding the industry in a stand-still. So, I guess we need to focus on what little there is of interest.

This morning, for instance, TomTom announced that its mapping unit, the former Tele Atlas, is not for sale, but thanks for asking. Of course, it had to announce something since its stock had zoomed twenty Euro cents (3.2 percent) before falling back to close at an increase of thirteen Euro cents (a 2.2 percent increase). The announcement from TomTom was made by Taco Titulaer, who is the head of Investor Relations and Financial Communications at TomTom NV (at least that is how he is identified by Bloomberg Businessweek).

Taco, who was quoted in this article from Reuters, was very specific when he said “Our content assets are core to our strategy and product offering”. He added “TomTom is not considering to divest or sell those assets, which includes mapping.” Seems to me that Taco needs to start reading this blog.

Hmm. Let’s see how well this core strategy has been working. While TomTom acquired Tele Atlas for €2.9 billion in 2008, at the close of the market today the combined entity was valued at just under €1.4 billion. Guess the synergies that stoked TomTom’s ardor for Tele Atlas did not work out quite as predicted. Just why is TomTom’s stock priced so low?

Last February an article in Reuters by Roberta B. Cowan indicated that TomTom’s key markets were declining so rapidly that TomTom might not be able to shift into new sources of revenue fast enough to avoid calamity. The article was based on a surprise earnings announcement weeks earlier which stated that TomTom’s earnings would not grow in 2011 and that the PND market would decline over 10-15 percent during the same time period.

It is my opinion that the crux of TomTom’s denial of an acquisition of Tele Atlas is based on three issues. First, TomTom has so closely integrated Tele Atlas that the mapping company no longer exists as a separate entity. Second, TomTom’s business is content, but that content is integrated and involves maps and traffic. Third, TomTom has killed off most of the mapping capabilities that Tele Atlas once had and replaced these compilation skills with updating through MapShare, their PND resident software that allows users of these devices to anonymously report road corrections and the paths of their vehicles during their daily driving. In essence, Tele Atlas could not effectively update its maps without the data feeds from the TomTom devices. In sum, TomTom sees itself as a tightly integrated enterprise focused on content and not a loosely integrated group of operating companies with varied interests. Is being tightly integrated kind of like “putting all of your eggs in one basket?”

It does not require a “ThoughtWatt” (big step) to realize that as TomTom’s PND market declines, the value of Tele Atlas as a mapping company declines even more precipitously. In essence, as fewer PNDs are sold, fewer MapShare corrections will be generated and less traffic will be reported for traffic records. So, the question that was posed to Taco should have been “Is TomTom in discussions with an acquirer?” The answer to that question would be “Yes”, although in corporate speak it would likely be phrased something like this “TomTom is a viable entity forward integrating into the in-dash vehicle market and the market for live traffic.”

I presume that while Taco is reciting the real corporate version of the last statement, he will also be thinking this, “…and any other market that Apple wants to be in.” Well, it may be that some other company is pursuing TomTom.

However, whether anyone should be interested in TomTom will rest, in part, on how much TomTom has damaged the ability of Tele Atlas to compete in the world of navigable map databases. Note that the “damaged goods” issue may be given less strategic attention if the buyer thinks they have a process that would ameliorate this problem (which Apple does). Conversely, the “damaged goods” issue is always at the heart of a potential buyer’s valuation target, while the buyer’s strategy is rarely revealed in any meaningful fashion.

The notion that someone would consider buying TomTom for its assets (Tele Atlas, MapShare, navigation software and the company’s traffic data) is contingent on due diligence to determine whether there is real value in the asset. The value could be in the quality of the data or it could be in the time that buying these data might save someone who needed to have a database of navigation quality to support its growth goals. The company’s traffic assets may or may not be important in this case.

To me, the valuation issue is a very interesting question and one that does not have the same answer that it had in 2008 when Tele Atlas was acquired by TomTom and NAVTEQ by Nokia. I suspect that the fundamental issue may, as noted earlier, have something to do with how badly TomTom and Nokia have stumbled since these acquisitions, but it is likely that the important issues will have more to do with crowdsourcing, citizen mapping and the compilation of anonymous GPS locations and paths to extend and augment the navigation database. Of additional interest is whether these databases will be more focused on attributes of location than on navigation.

Hmmm. Sounds like something to dig into next time. Has the market for mobile location services changed the future of mapping? Will GPS paths replace field research, and what role will data quality play in answering the valuation question? More, next time.

By the way, I have decided to spend a couple of days at Where2.0 next month in Santa Clara, CA and hope to see you there.

Posted in Apple, Mike Dobson, Personal Navigation, Tele Atlas, TeleAtlas, TomTom, place based advertising | 1 Comment »

OSM vs. the Mechanical Turk – A New Option For Mappers?

February 8th, 2011 by MDob

“I wrote Muki Haklay to ask a question about his recent, interesting paper titled “How Many Volunteers Does It Take To Map An Area Well? The validity of Linus’ law to Volunteered Geographic Information”. You can download a copy from UCL Discovery.

I asked Muki if he thought that the error signatures he discussed in the article could be a function of the variability between different GPS devices or, potentially, reflect the variability in the GPS readings from specific devices located in different positions within a vehicle (car, bicycle, skates, etc.). Muki responded that he felt the accuracy issue may be more related to the quality of the aerial imagery that the volunteers use, noting that imagery, and not GPS receivers, is the main source of OSM data (at least in the UK). He indicated, in his note to me, that in the OSM data he studied for the United Kingdom “… the positional accuracy is derived from the quality of the orthorectified imagery.”

I pondered Muki’s response, as it opened another door for me in how to think about Volunteered Geographic Information. I had conceptually linked crowdsourcing and VGI with User Generated Content, believing that those who participated in these activities were somehow contributing local knowledge to the solution of problems that were essentially geographical. I guess that when I thought about the term Volunteered Geographic Information, I made the mental leap that these volunteers were providing content in the form of spatial information reflecting the geographic areas in which they lived or with which they were more than casually familiar. It has now occurred to me that there may not always be a direct beneficial relationship between geographical knowledge and Volunteered Geographic Information.

Of course, this raises the question of whether or not various aspects of geographical information can be input into a system by different contributors and then harmonized to produce beneficial results. Joe digitizes the streets from aerial imagery, Jane attributes the streets with names, Bob adds addresses, Mary contributes Points of Interest and Blotto QCs the results. Based on my experience, I have found the method of sequential editing to work well in the compilation systems used by commercial map database vendors, where the team is structured and incented to perform the specific work assigned to them based on completeness measured by formal quality assurance methods within a specified schedule. But does this division of labor work well in a system like OSM where the workflow process is not managed to ensure the completion or comprehensiveness of a specific temporal goal for coverage?”

Certainly there have been examples of the success of this type of collaboration working for OSM, as can be found in their superb accomplishments in Haiti, Gaza and Baghdad. However, it is possible that these examples are exceptions, rather than common practice, whose results were accelerated by the humanitarian emergencies involved. Might the division of labor in OSM’s UK database result in data quality and completeness variations that preclude the use of the data across spatial extents? Since crowdsourced databases are considered to be self-healing over time, the logical question may be, “Can you know when the database or a segment of it is acceptable for some use?” Unfortunately, the crowdsourced system is constantly evolving and people choose to use it based on measures other than overall completeness or fitness for use. Measuring fitness for use is a difficult question to be sure and one that I will return to in a future blog on crowdsourcing.

More troubling to me is the possibility that, in some cases, applications that are described as examples of VGI or crowdsourcing may in fact be examples of the concept of the Mechanical Turk and devoid of any geographical expertise provided by the contributors. Before some of you erupt in shouts of “heresy”, read the rest of the blog, as this is really an interesting and thought provoking topic.

Mechanical Turk?

The original 18th century Mechanical Turk was billed as an automaton chess player seemingly driven by a box of gears and dressed in Turkish garb. Although successfully defeating capable chess players, the Mechanical Turk was a hoax, as a human chess master, concealed inside the apparatus, operated the machine.

In today’s world, the Mechanical Turk, billed as “Artificial Artificial Intelligence”, is Amazon’s automated marketplace for work in which requestors ask candidates to perform individual tasks, called Human Intelligence Tasks (HITS), which are defined as tasks requiring human intelligence to solve. Amazon’s Mechanical Turk is a web service that provides a large network of humans with computers to perform tasks of interest to requestors. The requestors can choose to approve completed HITS before paying for them or to auto-approve them sight unseen. You can find out more details of Amazon’s version of the Mechanical Turk here.

When I looked, there were 92,767 HITS available, in case you are looking for something to do. (Try this link to find HITS.) An example of a relevant HIT that I copied from the Amazon Mechanical Turk website follows:

Find URLs to Department Store hours of operation

Instructions

Thank you for accepting this task! We need your help to improve our knowledge about Department Stores across the United States.

For each Department Store location, please find a link to the hours of operation (opening and closing hours) on the business’s official website. This information will be used to improve information in maps, websites, and mobile devices (phones, GPSs, etc.). This is what we need you do to:

Hours of Operation:
Many stores list their hours of operation online. We want to capture a URL/link to a page on the business’s official website that shows its hours of operation (opening and closing hours). The examples below demonstrate what these pages might look like for three different department store chains:”

Performing this task will net you $0.10 per HIT.

This is clearly a case of someone augmenting their Points of Interest or business listings database using HITS as a data collection method.

In the example above, the persons performing the HITS are not required to know anything about the department stores whose details they are being asked to collect. In essence, the contributors are being asked to create attribute data for department stores that will be found in business listings used in navigation and related mapping databases.

“So”, you ask “What is the relationship between OSM and the modern version of the Mechanical Turk?” If the majority of OSM contributors to the UK database are spending their time digitizing imagery for the UK portion of the OSM database, as opposed to contributing GPS traces and attributes from paths along which they have traveled or know something about, how likely is it that the OSM effort in the UK benefits from local knowledge to the same extent that it benefits from “free” digitizing?

Muki Haklay’s previous finding (Haklay and Ellul 2010) that there are more unattributed cells in the OSM UK database than in the Meridian data set he often compares it to may just be a reflection of the fact that people are digitizing roads about which they know nothing – operating as if the task were a HIT, but not contributing any local knowledge that might enrich the effort. Based on this observation, we might ask, “Where is the local knowledge in OSM?” An illustration in Haklay’s (2009) report on OSM in the UK appears to indicate that 50 percent of the data in the OSM United Kingdom Database (at that time) were contributed by fewer than 30 contributors, raising, for me, the issue of whether OSM is a good example of the transfer of collective geographical knowledge through crowdsourcing, or, perhaps, a “fee-less” example of HITS and the Mechanical Turk.

Is the fact that OSM usage has not made a dent in the fortunes of NAVTEQ, TomTom/Tele Atlas or Google somehow related to this division of labor between digitizers and attributers? Is OSM’s inability to effectively manage its workforce of voluntary contributors to complete coverage in an area further hampered by the lack of local geographic knowledge and is this limitation a fatal flaw?

Haklay and Ellul (2010) appear to indicate that there is a social bias in the OSM database of the UK. Obviously you do need a computer, an internet connection and plenty of time to digitize road networks, conveniences that are not available to all members of society. What that may mean is that the distribution of those empowered to contribute to OSM, at least in the UK, may not mirror the area of coverage targeted for the database. If this limiting scenario is a possibility, then who will attribute data in the areas not populated by those who fit the OSM profile? Hmmm. We may have to write about these issues in detail, but not today.

No, in this blog, I want to focus on the notion that the use of HITS might be a tool that could be used by the commercial map database providers to protect and extend the superiority of their products over the threat posed by crowdsourced databases. After all, the fact that OSM data is gathered by volunteers is, in my opinion, the most significant of the three competitive advantages that OSM has compared to the commercial providers of navigation databases. Of the two other advantages, I note that OSM’s use of open software is not unique, although its mapping efforts are advantaged by the lack of an expense for managerial overhead since OSM is a self-managing entity. It may be that the cost differential between the commercial vendors and OSM could be decreased by using HITS to create portions of a navigation database. Also, since the workers are self-regulated and incented to perform tasks, the management expense of creating the data would decrease in some relationship to the application of the HITS method.

You know, the cost differential could be so significant that it might be time for a new entrant in the navigation database market?

How about this – Assuming this was legal (another topic I will not discuss here), how about running a HITS based on examining Street View and encoding every street sign, address and informative marking visible on and along roads and buildings? Yeah, I know, Google’s got this covered, but performing this task in a way that would provide valuable information is a task just made for a HITS. Maybe NAVTEQ should try this out with NAVTEQ True now being collected by their dilithium crystal-powered vans?

What if a commercial company adopted the HITS approach to building their navigation database or attributing some of the digitized information already in their database (in fact that could be the case in the example I provided above)? Could the use of HITS be part of the solution to building and maintaining a low cost map database? Would people be willing to digitize or attribute digitized lines for a low fee per mile or scene?

Or consider this alternative. Google and others are using various forms of pattern recognition to extract information in imagery, but in certain cases the algorithms have difficulty in returning high-quality data due to problems interpreting the scene. Why not use HITS based on analyzing an image/public domain map duo and have humans unravel the problems that give machines fits? (This is an example of Dobson’s newly patented process popularly known as HITS for FITS)

Yep, there are a number of problems with the HITS approach and issues of quality control and accuracy of response seem to be leading the list. And yes, we could argue about them for weeks. But let’s cut to the chase. If you could run HITS at a very low cost, you could afford to do it more than once for a specific location or area and mimic the self-healing error process of crowdsourced systems. Could HITS be cheap enough and good enough to provide a sustainable competitive advantage? I think it could be, if properly managed. And wouldn’t it be fun to find out?
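
For readers who like to see an idea in code, here is a minimal sketch of the "run it more than once" notion: issue the same task to several workers and keep an answer only when enough of them agree. The function name, the 60 percent agreement threshold and the sample answers are all my own assumptions for illustration; nothing here reflects how Amazon or any map vendor actually consolidates results.

```python
from collections import Counter

def consolidate_hit_answers(answers, min_agreement=0.6):
    """Return the majority answer for one location, or None if workers disagree too much.

    Hypothetical helper: the name and the threshold are illustrative assumptions.
    """
    # Normalize trivially different answers (case, surrounding whitespace).
    cleaned = [a.strip().lower() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    value, votes = Counter(cleaned).most_common(1)[0]
    # Accept only when a clear share of workers agree; otherwise re-issue the task.
    return value if votes / len(cleaned) >= min_agreement else None

# Hypothetical responses from three workers asked to read the same street sign.
print(consolidate_hit_answers(["Woodall Road", "woodall road ", "Woodhall Road"]))
```

A task that comes back as None would simply be posted again, which is where the low cost per HIT matters: the repeat runs are what stand in for the self-healing of a crowdsourced database.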

Well, that’s enough thought provoking stuff for today (okay, maybe it was just mildly interesting and not thought provoking, but interesting enough that I may write about it again).

Note – in various ways, this blog benefited from notes kindly sent to me by Muki Haklay (University College London) and Don Cooke (Esri and almost professional “cellist”) as well as a conversation I had with Pat McDevitt, formerly of TomTom/Tele Atlas and now with MapQuest (that’s an interesting move). However, the mistakes, errors in logic, sarcasm, misspellings and lunatic ravings are, unfortunately, entirely mine.

Speaking of lunatic ravings, have you heard about the Google “Whack-A-Mole Rediscovery Project?”

According to the company, the response to any question about an error on their maps is

“Getting mapping right is a difficult challenge and we are working hard to improve our product. And, yes the problem was fixed, but then we found the incorrect information again when we spidered a website that had scraped the incorrect information from ours and we updated our site to reflect this ‘new’ information. In the meantime the website we spidered had scraped the corrected information recently shown on Google Maps (but now changed), which we will rediscover in a few weeks and change back. Unfortunately, by that time, they will have discovered our current data and change their data again, which we will rediscover and…”

References cited

Haklay, Mordechai (Muki) and Claire Ellul. (forthcoming). Completeness in volunteered geographical information – the evolution of OpenStreetMap coverage in England (2008-2009). Journal of Spatial Information Science

Haklay, Muki, 2009. Understanding the quality of user generated mapping – comparing OpenStreetMap to Ordnance Survey Geodata, (PowerPoint Presentation) http://povesham.wordpress.com/2009/01/12/osm-quality-assessment-s4-presentation/

Posted in Authority and mapping, Google, Google maps, MapQuest, Mapping, Mike Dobson, Navteq, OSM, User Generated Content, Volunteered Geographic Information, crowdsourced map data, map compilation, map updating, openstreetmap | 1 Comment »

Musings – Or You Can’t See There from Here

January 28th, 2011 by MDob

Last week Duane Marble sent me two news items that you might find of interest. The first item included a link to a short piece on someone’s observation that the major online map websites did not show the correct location of the football stadium that is to host the Super Bowl. About an hour later I got around to reading the article at Directions Magazine, but I found a comment indicating that the stadium was correctly located on all major web mapping sites. Either the first story was incorrect, or the maps were updated once the scandal was made public. But the interesting issue is how would you determine which possibility was, in fact, true? As far as I know, no major online mapping site has installed the infamous, but highly useful “time-machine” button.

The second item of interest described an activity involving the USGS scanning and georeferencing historical USGS quadrangles. The activity was described as follows:

“The USGS Historical Quadrangle Scanning Project (HQSP) is scanning all scales and all editions of approximately 250,000 topographic maps published by the U.S. Geological Survey (USGS) since the inception of the topographic mapping program in 1884. This scanning will provide a comprehensive digital repository of USGS topographic maps, available to the public at no cost. This project serves the dual purpose of creating a master catalog and digital archive copies of the irreplaceable collection of topographic maps in the USGS Reston Map Library as well as making the maps available for viewing and downloading from the USGS Store and The National Map Viewer.”

Receiving these notes resurrected a concern that I have batted about over the last decade involving the archiving of cartographic products in an online environment. Simply put, “Can online maps continue to fill the role served by paper maps as a historical resource?”

The “establishing a past state of a mapping database” problem first reared its head in my life when I was working in the world of commercial cartography. We had converted our cartographic renderings that had been prepared for printing based on manual map preparation technologies (scribing on negative film, interposed screens, camera work, etc.) to digital technology (digitizing, software manipulation of spatial data and output on film written by laser scanners). However, we continued our long held practice of storing printed copies of our products for purposes of copyright certification, as well as in response to a blend of other requirements. Among these ancillary concerns was the need to be able to document how our maps had represented some feature at a previous point in time. Usually these types of requests were associated with defending the company from lawsuits, for example, in which a party was claiming that one of our products misrepresented some geographical feature crucial to upholding the assertions underpinning a lawsuit.

The issue of interest in the last example was that our products (paper maps) were produced from product databases that were extracts of our Master Database. While we could and did archive copies of our master database based on legal requirements (e.g. assets guaranteed in financial transactions), recovery concerns, and general best practices, the number of product databases soon grew at a rate that outstripped our ability to finance the time, media and off-site storage expenses of archiving every edition of every product. Conversely, we were quite easily able to archive the paper products. In addition, our customers were quite easily able to archive their copies of our paper products, in case they wanted to examine what had changed in a particular location from one period of time to the next edition of the product. Do any of you have complete copies of Google’s U.S. map base from when it was first introduced?

A few years later, I was conducting some expert witness research in respect to a patent case and, as part of my efforts, I wanted to determine when a specific digital mapping application was first launched on the Internet. I managed to find the team responsible for the product, but not one of them could remember the date when the functionality was first “stood-up” and pushed out to the organization’s “live” site on the Internet. While on another patent case, I desired to find out when a specific functionality was made available to the public on a website that provides mapping services. Unfortunately, no one on the team responsible for the website remembered when the functionality was introduced. I should point out that the people I hoped could answer the questions I was asking were not involved in the litigation, nor would they be impacted by the results of the action – they simply did not record a history that told them when they launched or revised their online products. This lack of a formal “corporate memory” related to spatial databases and mapping functionality will surely be regarded as a significant lack by historians looking back on this era and trying to figure out who did what, when, where and how.

Just this last week someone from Providence sent me an image of some interesting public artwork that was being destroyed, as it was on the face of one of the former I-195 overpasses that are in the process of being deconstructed. Yes, this is an overpass that NAVTEQ and Google currently show traffic flowing across, even though this Interstate route has not been open for traffic for over one year. However, in trying to recreate the varying geometry used by the online mapping services to represent the interstate road network in Providence over the last year, I had no place to turn. If a historical record of mapping database changes exists, it appears not to be publicly available from either the providers of the online mapping systems or from their navigation database providers.

My interest is that I was trying to conceptualize what methodological changes need to be introduced to help ensure that those managing map database systems could preclude these types of senseless errors. However, the process led me into a consideration that the focus of online mapping systems on the rapid, near-real-time presentation of map changes is causing us to lose the capability of tracking spatial changes across time and potentially impacting our ability to research some aspects of change detection. (After all, where would mystery novels be without the super sleuth suddenly realizing that Skyline Drive used to be called Beacon Road at the time of the kidnapping forty years ago?)

While imaging systems capture everything that can be resolved in the image scene, they do so at the expense of the attribution of the elements that are sensed. For example, although it is easy to spot the difference between a large road and small road on imagery, it is not feasible to identify one linear structure as Interstate 95 and the other as Woodall Road, at least not on the basis of the imagery alone. Further, aerial imagery does not provide information on spatial features that are not part of the visible spectrum (such as legal boundaries, zip codes, census tracts, etc.). Maps serve as a source of compiled information that is made sensible through the icons, marks, and other symbols assigned to features by cartographers. Maps are a medium well suited to help us assess real world features, such as street names and boundaries, that may have changed across time, as well as allowing us to note where these changes occurred and how they may be related to other spatial data. However, to support this use, multiple editions of the map (or photo-map if that is your preference) must be available that provide coverage of the area of interest.

Maps serve more uses than location reference and it is my fear that many important uses of maps have no future in the rapid update, image-oriented, online world of maps and mapping. While the USGS can digitally Xerox all of the editions of its quadrangles dating back to 1884, they are able to do so because they have retained the printed editions of their topographic maps. Conversely, how can we know how Community X was represented on Google Maps two years ago? Or how the road geometry of my neighborhood was represented twelve years ago on, say, MapQuest?

Google, for example, clearly understands the value of archival information and provides historical imagery in Google Earth in an attempt to respond to this need. When viewing London in Google Earth, the application provides imagery from several periods between 1945 and 2010, selectable on a slider bar and displayed as an overlay. Somewhat curiously, Google insists on showing modern 3D-models, in place of historic buildings as they existed in 1945. However, the imagery from that period was of poor quality compared to what we expect today and the truth is that the city could be better represented by scans of paper maps that showed the geometry and names of London streets, icons of buildings and the POIs as they were named at these points in time.

When I thought further about the problem, I realized that we were able to solve the map comparison problem in the past, in part, precisely because our maps were not integrated into a master database spanning the extent of all of the geographical areas we could research and compile. Indeed, in the past the problem was just the opposite. I remember various news articles on researchers who were leasing warehouses and using the floor space to fit together the topographic sheets for a region into a master database of sorts. Of course the common complaints were that the paper maps were not dimensionally stable, the map seams did not butt as smoothly as had been hoped and the projection seemed off in the case of a curious quadrangle or two. Fancy that!

I guess time and the future provide new opportunities and new reasons to re-investigate the old problems we thought would be solved in the future. Could we put together a useful, workable archive of online maps that could be used for purposes of analyzing historical change detection across long periods of time? Something to think about, I guess. But when I contemplate how to organize and archive those 15 minute updates now used by some online map databases and make them accessible to the world, I get a headache.
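
To make the archive idea a little more concrete, here is a minimal sketch of what a "time-machine" lookup might look like for a single map feature: every dated version is kept rather than overwritten, and a query returns the version in force on a given day. The class name, the attribute layout and the dates are my own illustrative assumptions (borrowing the mystery-novel road from above), not a description of any provider's database.

```python
from bisect import bisect_right
from datetime import date

class FeatureHistory:
    """Keeps every dated version of one map feature instead of overwriting it."""

    def __init__(self):
        self._dates = []     # effective dates, kept sorted
        self._versions = []  # attribute dicts, parallel to _dates

    def record(self, effective, attributes):
        # Insert in date order so late-arriving updates are still archived correctly.
        i = bisect_right(self._dates, effective)
        self._dates.insert(i, effective)
        self._versions.insert(i, dict(attributes))

    def as_of(self, when):
        # Latest version whose effective date is on or before `when`.
        i = bisect_right(self._dates, when)
        return self._versions[i - 1] if i else None

# Hypothetical history for one road; the dates are invented for the example.
road = FeatureHistory()
road.record(date(1965, 1, 1), {"name": "Beacon Road"})
road.record(date(1990, 6, 1), {"name": "Skyline Drive"})
print(road.as_of(date(1971, 3, 15)))
```

The lookup itself is the easy part. The headache mentioned above is the volume: doing this for every feature, at every 15 minute update, and then keeping it all accessible.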

Speaking of headaches – I have reservations for travel to Cairo on February 16th followed by a Nile River Cruise, a flight to Abu Simbel, a cruise across Lake Nasser to the Aswan Dam, followed by a visit to Petra, Jordan, returning home in early March. I guess I should call the UN and warn them whenever I plan international travel. Well, there go my plans for a trip of a lifetime. On the other hand, my loss pales in comparison to the plight of the heroic citizens of Egypt who are willing to risk their lives for an opportunity for a better life. My thoughts and prayers are with them.

I guess I will just have to focus on blogging a bit more often to fill in my new wealth of spare time. In any event, whenever the next blog is published, I am going to write about the concepts of the Mechanical Turk and User Generated Content in mapping and ask which is which.

Posted in Authority and mapping, Categorization, Geospatial, Google maps, MapQuest, Mike Dobson, Navteq, User Generated Content | 1 Comment »

Great Book On OSM – Bad Review for NAVTEQ

January 12th, 2011 by MDob

Slightly before Christmas, courtesy of one of the authors, I received a copy of “OpenStreetMap, Using and Enhancing the Free Map of the World” by Frederik Ramm, Jochen Topf and Steve Chilton. Published with a 2011 copyright by UIT Cambridge LTD, the manuscript is an indispensable guide to those interested in harnessing the power of OpenStreetMap (OSM) for mapping, navigation and location services.

Frederik Ramm and Jochen Topf are the main authors of the book, which was originally written in German. They expanded and translated the original edition to produce an internationally-oriented English language edition which was further edited by Steve Chilton. Ramm and Topf have worked in the IT industry for over a decade and with OSM since 2006. Among their many accomplishments is the founding of Geofabrik GmbH, which provides products and services based on OSM and other open geodata sources. Frederik Ramm may also be known to those of you who follow the Legal-Talk mailing list focused on OSM. Steve Chilton is currently the Chair of the Society of Cartographers, as well as the Educational Development Manager at Middlesex University in the United Kingdom.

I was unsure what to expect when I picked up this weighty tome of 335 pages comprised of 27 chapters, 32 color plates, numerous diagrams and an appendix that follows-up on topics not detailed in the main text. After having read the book, I think of it as “OSM – The Missing Manual”. If you want to work with OSM by contributing data, making maps, writing OSM related software, or simply want to know the details of how this crowdsourced mapping system works, this is the read for you.

The book is divided into four sections and starts with an introduction to OSM and its community of supporters. The second section focuses on the use of GPS devices to provide data, how to upload data to OSM, and concludes with how to edit OSM data using a variety of software packages. The following section discusses making and using maps and routes from OSM data, focusing on numerous rendering engines. The section ends with a brief summary of licensing issues, a topic that I will return to later. The final section is a wide-ranging discussion of “hacking” OSM, aimed at developers and hackers interested in exploring the ins-and-outs of the OSM database server, as well as advanced editing. For those of you hoping this book would provide a spirited discussion of crowdsourcing and its use for map compilation, you should know that the authors are true believers. I doubt the question is of interest to them or to their intended audience.

As noted above, I found the book to be very informative, but will note here that reading it is tough sledding if you have no intention of trying to use the tools and techniques described. However, people with that mindset are not the audience for whom the book is designed. Instead, this is the ultimate read for those who are interested in contributing to or using OSM and in that role it is an excellent introduction.

Lest my readers think I’ve gone soft, there are a number of concepts described in the book that I cringed while reading. On page 225 the authors write that “There is no clear distinction between navigation and route planning.” I had always thought that route planning (calculation of a route based on attributes of a transportation network) and positioning (obtaining a relative position and orientation of a vehicle to the transportation network with respect to the data representing the real world) were combined in navigation, but these and other minor issues are likely of little importance to the intended audience of the book.

Where I thought the book a bit off-target was in the chapter titled “License Issues when Using (OSM) Data.” At the start of the chapter is an inset note indicating that the authors are not lawyers and that the “…chapter documents community practice or the reasoning of the authors. If in doubt, you should contact a lawyer.” To me this appears to be the written version of “I’m not a lawyer, but I play one on television.” Any commercial firm that has an interest in using OSM data should consult a lawyer before they initiate any work on a product or project involving the use of OSM data. Overall, however, this chapter appears to provide a reasonable overview of the current CC BY-SA license and the proposed Open Database License (ODbL). Some of the conclusions reached by the authors may reflect their desire or that of the user community on how the licenses should be interpreted, as opposed to how they might be interpreted by the legal community. (I learned many years ago (shame on me) that what you as a lay person thought was reasonable and commonly agreed, ceased to be either when there was money and lawyers involved.)

In sum, this is a book deserving of a place on your bookshelf. It is well written, comprehensive and worth your time, especially if you want to find out how to do things with OSM and don’t want to spend days trying to find the topic in the OSM Wiki. Even if you do find the topic there, I think you would prefer to read the description in “OpenStreetMap, Using and Enhancing the Free Map of the World”.

Now back to beating NAVTEQ

Once upon a midnight dreary, while I pondered, weak and weary,
Over many a quaint and curious volume of forgotten lore–
While I nodded, nearly napping, suddenly there came a tapping,
As of someone gently rapping, rapping at my chamber door.
‘’Tis some visitor,’ I muttered, ‘tapping at my chamber door -
Only this, and nothing more.’
But should delight them…
with a NAVTEQ map of
…Providence, Rhode Island?

Anyway, late the other night I received an email from a contact who spilled the beans. As hard as it is to believe, NAVTEQ has once again decided to route people over the now being deconstructed, no longer available to traffic portion of I-195 that they finally managed to eliminate from their database last year. The two images below were screen captured from the NAVTEQ corporate website on January 10, 2011 (and the geometry has not yet been changed as of the publication date of this blog).

NAVTEQ decides to show the wrong road, again.

I now realize that the reason that they would not give me a ride in one of their new vans (the ones with the dilithium crystals) is that the sensors in these units have yet to be calibrated. If you look at the map above, you will notice that not only has the “closed road” been reopened by NAVTEQ, but they appear to be tracking a normal traffic flow on it. How does that work?

One possibility is that NAVTEQ forgot to time slice their traffic data and the years and years of recorded traffic is of such a volume that the traffic on the recently reconfigured section is not strong enough to pull the “established” path to the new geometry. Wait, wait. I know. Maybe their new crowdsourcing capabilities informed them of this change. Unlikely. However, either explanation would not apply if they had driven the new alignment with either their new or older field research vehicles. So, what we seem to be left with is that NAVTEQ makes changes to their database that are not field verified. Hmmm, after all of these years of telling us that this was what distinguished them from Tele Atlas, we now have some indication that this was just marketing speak. Bah, Humbug. Is there no decency in the world of mapping? But just to emphasize my shock at what NAVTEQ has managed to do, look at the next figure.
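
For what it is worth, the time-slicing guess can be sketched in a few lines. I have no knowledge of how NAVTEQ actually processes its traffic data, so everything below (the function name, the 90-day window, the segment labels and the probe counts) is a hypothetical illustration of the general idea: if all recorded traces are weighted equally, years of history on the old alignment swamp a few months of traffic on the new one.

```python
from datetime import date, timedelta

def dominant_segment(observations, today, window_days=None):
    """Pick the road segment with the most probe observations.

    observations: list of (segment_id, observation_date) pairs.
    window_days:  if given, only count observations that recent ("time slicing").
    """
    counts = {}
    for segment, seen in observations:
        if window_days is not None and (today - seen) > timedelta(days=window_days):
            continue  # too old to reflect the current road network
        counts[segment] = counts.get(segment, 0) + 1
    return max(counts, key=counts.get) if counts else None

today = date(2011, 1, 10)
# Invented probe data: years of traces on the old alignment, a few weeks on the new one.
probes = [("old_I-195", date(2008, 5, 1))] * 1000 + [("new_I-195", date(2010, 12, 1))] * 50

print(dominant_segment(probes, today))                  # whole history: old alignment wins
print(dominant_segment(probes, today, window_days=90))  # recent slice: new alignment wins
```

Whether anything like this explains the map above is pure speculation on my part, which is rather the point of the paragraph.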

NAVTEQ includes now non-existent signs to show destinations that cannot be reached by this route

Wow, those new dilithium powered vans are incredible. Not only can they travel on elevated highways that are being torn down, but they can digitize and photograph road signs that no longer exist. Amazing. Simply amazing.

Posted in Mapping, Mike Dobson, Navteq, OSM, Tele Atlas, User Generated Content, Volunteered Geographic Information, crowdsourced map data, openstreetmap | No Comments »
