Several weeks ago we posted a few thoughts about the death of the ZIP code. There’s a lot more to say from the geo-perspective on local search, and here’s some more fodder…

To give any data a geographic context, it must be spatially referenced to the Earth. Geographic information systems (GIS) serve as a means of referencing this information. Within the context of local search, addresses, city boundaries, postal codes or other geographic data must be ‘translated’ from human terms (690 Fifth Street, San Francisco) to machine terms, i.e., latitude and longitude (37.775429, -122.397314). This geocoding process allows databases to recognize human-language requests. To geospatially reference (say) a postal code, one would expect that area to be spatially defined as a polygon. When a user searches for (say) “coffee in 94107,” the ZIP code should serve as the geographic constraint, with the search running within this polygon. Correct?

Wrong! There are several reasons why the logical thing doesn’t happen. Most obviously, ZIP codes were defined as letter carrier routes; they were not meant to serve any other purpose. As such, a ZIP code may not conform to what you expect: one side of a street, one floor of a multi-story building or one-half of a block may not fall within the postal code you expect. In fact, many parties that claim to use a ZIP code database actually obtain this info from a sister governmental agency, and these boundaries are stylized representations of the USPS data.

More to the point, these stylized boundaries are likely not used. Instead of associating (say) 50 latitude/longitude points to define the postal code boundary, technical optimization says one point is sufficient. The analogy here is reducing a novel to a word: in the context of local search, granularity matters, and using the mathematical center of a polygon serves to distort and misinform a user’s search. In practice, the centroid is used because a single point is cheaper to compute with than the actual shape. Reducing the contours and nuances of a small area to a point, often with a radius drawn around it, effectively makes all postal codes look like circles. Gaps and overlaps are formed, further distorting the expected reality for a user.
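To make the "one point instead of fifty" shortcut concrete, here is a minimal Python sketch of how a boundary might be collapsed to a centroid and then searched with a radius. Everything in it is an assumption for illustration: the L-shaped BOUNDARY is invented (it is not the real outline of 94107 or any other ZIP), and the function names are hypothetical rather than taken from any particular search product.

```python
import math

# Hypothetical L-shaped boundary as (lat, lon) vertices. Invented for
# illustration; this is NOT the real outline of any ZIP code.
BOUNDARY = [
    (37.760, -122.410),
    (37.760, -122.380),
    (37.770, -122.380),
    (37.770, -122.400),
    (37.785, -122.400),
    (37.785, -122.410),
]


def polygon_centroid(vertices):
    """Area-weighted centroid via the shoelace formula.

    Treats lat/lon as flat coordinates, which is a reasonable
    approximation only for areas as small as a postal code.
    """
    lat0, lon0 = vertices[0]  # shift to a local origin for numeric stability
    pts = [(lat - lat0, lon - lon0) for lat, lon in vertices]
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross                 # twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return lat0 + cx / (3 * area2), lon0 + cy / (3 * area2)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def within_radius(point, center, radius_km):
    """The 'circle' search: one distance check against one point."""
    return haversine_km(point, center) <= radius_km


# Six boundary vertices reduced to a single point.
CENTROID = polygon_centroid(BOUNDARY)
```

With a 1 km radius around CENTROID, the north-west corner of this invented boundary, which is by definition part of the area, falls outside the circle. That is the distortion in miniature: the radius check is a single comparison, but it no longer knows anything about the shape it replaced.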

Graphically, this can be represented by overlaying the ZIP code boundary with a circle centered on the centroid. The circle includes area that is not part of the postal code area, and vice versa. A user searching in this ZIP will therefore not get all the relevant listings, and may get some that do not belong. Some will argue this is a technology issue, but from the above example, it is clearly more of a mindset issue: getting product managers to think about the how and why of data will go a long way.
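The mismatch between boundary and circle can also be shown in a few lines of Python. This is only a sketch under stated assumptions: the ZIP is an invented 4 km by 1 km rectangle on a flat grid, the radius is an arbitrary 1 km, and the four cafes are made-up listings. None of it is real USPS or listing data.

```python
# Hypothetical ZIP drawn on a flat grid in kilometres: a long, thin
# rectangle, 4 km by 1 km. Invented shape, not real USPS data.
ZIP_POLYGON = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (0.0, 1.0)]
CENTER = (2.0, 0.5)   # centroid of the rectangle
RADIUS_KM = 1.0       # assumed search radius around the centroid

# Made-up listings as (x, y) positions in km.
LISTINGS = {
    "Cafe A": (2.0, 0.5),   # inside the boundary and the circle
    "Cafe B": (3.8, 0.5),   # inside the boundary, outside the circle
    "Cafe C": (2.0, 1.4),   # outside the boundary, inside the circle
    "Cafe D": (5.0, 3.0),   # outside both
}


def in_polygon(point, polygon):
    """Ray casting: count the edges crossed by a ray going right."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the ray's height
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def in_circle(point, center, radius):
    """Centroid-plus-radius check on the flat grid."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * dx + dy * dy <= radius * radius


by_polygon = {n for n, p in LISTINGS.items() if in_polygon(p, ZIP_POLYGON)}
by_circle = {n for n, p in LISTINGS.items() if in_circle(p, CENTER, RADIUS_KM)}

missed = by_polygon - by_circle   # relevant listings the circle drops
extra = by_circle - by_polygon    # listings from outside the ZIP

print("missed by the circle:", sorted(missed))
print("wrongly included:", sorted(extra))
```

In this toy setup Cafe B sits inside the ZIP but is missed by the circle, while Cafe C sits outside the ZIP but is returned anyway. The polygon test costs a loop over the boundary edges instead of one distance comparison, which is the efficiency trade-off described above.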