Exploring Local
Mike Dobson of TeleMapics on Local Search and All Things Geospatial

If the POI Data is So Good, Why are Search Results so Bad?

June 15th, 2007 by MDob

Did you know that the top 5 POI categories in the U.S. (according to infoUSA) are as follows?

1. 779,761 Doctors’ offices
2. 516,234 Restaurants
3. 373,655 Places of Worship
4. 255,166 Beauty Shops
5. 179,088 Auto repair shops (Curiously, there are more of these than there are gas stations.)

This is the sort of stuff that could lead you to give up hope in the U.S. or die laughing at the spurious correlations a conspiracy theorist could draw from these data. For example, it appears that cars may “get sick” more often than people. Or, sequentially, people repair their cars, fix their hair, pray, eat and then need to see a doctor. All this is really meant to say that you need to look beyond the obvious statistics to understand most situations. POI/business listings databases are a good example.

A friend wrote me an email responding to our blog on infoUSA and indicated that while the data I presented about infoUSA might be true, it did not help him understand why local search functionality often returned erroneous listings. So, why does local search so often have difficulty finding a listing for a specific type of business in the appropriate neighborhood?

Specifically, I am interested in searches:

1. Where a business, whose location we already know, is not returned in a name or category search of the area in which it is located.

2. That provide search results including companies no longer in business or perhaps no longer in business at the address provided.

3. That provide search results for businesses that never existed at the location returned. Our first image shows the location of a Costco detailed by a local search provider. The second screen image is from Costco’s own store locator, which indicates they do not have an outlet in the location specified.

Search result indicating the location of a Costco that does not exist, according to Costco

Figure 1.  A phantom Costco shown by a provider of local search

A screen-shot of Costco's store locator, which indicates Costco does not have an outlet at this location

Figure 2.  Costco’s own store locator indicates the store does not exist.

Here are some of the many reasons why local search results are often less than desirable.

1. Not all of infoUSA’s data are included in its basic business listings extract and the same is true of the other major data providers. If you want augmentation you have to pay for it. To be honest, infoUSA is one of the “higher-priced spreads” and some search providers may have concluded that the potential market gain from using more accurate data does not justify the higher price of licensing the data.

The underlying issue here is that many service providers are unwilling to pay for high-quality data because they are unable to specifically quantify the benefits. While cheaper is always enticing, it is not always the best solution for your customers. From the standpoint of the data providers, if they cannot recover the cost of their data gathering and QA functions, they will lean toward creating sub-standard data. This vicious cycle plagues all data supply industries.

2. Compiling and fusing a database of 14 million businesses with other listings data and search functionalities is not a trivial task for the local search/IYP providers. Our research indicates that several search providers have fallen behind the update cycle of their data providers. (By the way, this is also a common problem with integrating updated map databases.)

3. Some search providers integrate data from several providers of business listings (especially for the “trades”, such as construction) and fuse the data without filtering the various data sources for duplicates. Unfortunately, duplicate listings are often similar but not identical, so they cannot be eliminated without considerable analytics.

4. Some search and IYP providers allow business owners to create business listings online, although the companies rarely have staff dedicated to managing and maintaining these “custom listings”.

5. Most business listings providers “scrape” Yellow Page and White Page directories to create their database of business listings (yes, other sources are used but these are the main ones). Although it sounds counterintuitive, not all companies advertise in the Yellow Pages or have their business listed in the business White Pages. For example, you will not find our company, TeleMapics, in either the White or Yellow Pages (while our coordinates are local, we market ourselves to national and international companies).

6. Not all business listings get checked or thoroughly vetted. Although we suspect that every data provider spends some time updating and augmenting its data, keeping these data up-to-date and comprehensive is a game of Whac-A-Mole: you can never quite beat the errors out of the data. However, some providers clearly do a better job than others.

7. Even the best data providers have their weaknesses. For example, when I was the CTO of a local search company focused on publishing business listings in cellular formats, I received several emergency calls from our data provider (one of the leading companies) asking us to yank incorrect listings due to a “Cease and Desist” order issued by a court. Most often, the suit was based on some poor soul receiving phone calls about a business that went bust three or more years before!

8. Categorization. Ugh, this is really an ugly one that I will discuss next time. The problem is the mismatch between the categories that search providers create to reflect how they believe their customers search and the SIC/NAICS codes that the business listings providers use to classify the businesses in the first place.
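
To make the duplicates problem in item 3 a little more concrete, here is a minimal sketch in Python of the kind of "analytics" involved. It is an illustration only, not how any particular search or data provider works: the abbreviation table, the field names ("name", "address"), the 0.85 threshold and the sample listings are all assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real pipeline would need a far larger one.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd",
                 "suite": "ste", "incorporated": "inc", "company": "co"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse common abbreviations."""
    words = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 of the two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(listings, threshold=0.85):
    """Return index pairs whose name AND address both meet the threshold."""
    pairs = []
    for i in range(len(listings)):
        for j in range(i + 1, len(listings)):
            a, b = listings[i], listings[j]
            if (similarity(a["name"], b["name"]) >= threshold
                    and similarity(a["address"], b["address"]) >= threshold):
                pairs.append((i, j))
    return pairs


# Invented sample data: the first two records describe the same business.
listings = [
    {"name": "Joe's Auto Repair, Inc.", "address": "120 Main Street"},
    {"name": "Joes Auto Repair Inc", "address": "120 Main St."},
    {"name": "Main Street Bakery", "address": "122 Main St"},
]

print(likely_duplicates(listings))
```

With these three invented records, only the first two are flagged as a likely pair. Note that comparing every listing with every other listing is hopeless at the scale of 14 million businesses; a real system would presumably compare only within small groups (same ZIP code or phone number, for instance), and the threshold would need tuning against hand-checked data.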

A final note: Everyone is telling me our blogs are too long. Ok, Ok. I will spend the weekend repeating the word “concise”.
Thanks and have a good one.


Posted in Data Sources, Geotargeting, Local Search

