About Geocoding

A geocoded table is a table in which every record has a location given as a latitude and longitude in standard decimal degrees notation. Unless each record is geocoded with a latitude and longitude location, Manifold cannot know where that record is located. Once a table is geocoded it may be used to create drawings.

images/sc_tv_editcoords05.png

For example, the table above is not geocoded. It lists the names of towns in the United States but there is no way to tell from the table exactly where the towns are located. If we were to try to draw points on a paper map for each town we would not know where to place the points. If a table is not geocoded, it cannot be used to create a drawing in Manifold either, because Manifold also would not know where to put the points.

images/sc_tv_editcoords03.png

In contrast the above table is geocoded. Each record now has a latitude and longitude location given in decimal degrees notation. We could use the latitude and longitude values to draw a point for each town on a paper map of the United States. Manifold could also use this table to create a drawing.

images/sc_tv_editcoords02.png

If a table is geocoded it can be used to create a drawing, which in turn can be used in a map like the illustration above. The positions of the towns convey an immediate visual impression of their locations that a table presentation does not. That is part of the great power of a GIS like Manifold, so everyone who has important data and a GIS package would like their data to be geocoded so that it can be displayed visually within the GIS.

The problem is that a lot of the important data sets we deal with, whether they are lists of customer addresses or lists of oil wells or lists of fire hydrants that need maintenance, are not geocoded. The central problem for many GIS users is getting their data geocoded. Depending on the contents of the table, geocoding the table can be a reasonably straightforward process or it can be very difficult or even impossible.

Let's take a look at three geocoding tasks to see what approaches to geocoding are possible in different cases. We will look at geocoding a table of towns, geocoding a table of fire hydrants and geocoding a table of street addresses.

Geocoding a Table of Towns

Suppose we have a table with town names like the first example above. How can we geocode it? In the simplest case we look up the latitude and longitude of each town in a reference book or atlas and we add the latitude and longitude to the table for each record by hand.

If we have a table somewhere that already lists latitudes and longitudes for towns we could, of course, extract information from that table automatically. If we have another database table that contains the town name, latitude and longitude we could use SQL facilities such as a join to combine the two tables via a relation using a key field such as the name of the town.
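As a rough illustration of combining two tables on a key field, the sketch below uses Python's built-in sqlite3 module rather than Manifold itself. The table names, column names and coordinate values are hypothetical and only approximate; town name and state together serve as the key because town names alone are often not unique.

```python
import sqlite3

# Hypothetical tables and approximate coordinates, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE towns (name TEXT, state TEXT);
    CREATE TABLE gazetteer (name TEXT, state TEXT, latitude REAL, longitude REAL);
    INSERT INTO towns VALUES
        ('Carson City', 'NV'), ('Menlo Park', 'CA'), ('Smallville', 'KS');
    INSERT INTO gazetteer VALUES
        ('Carson City', 'NV', 39.16, -119.77),
        ('Menlo Park', 'CA', 37.45, -122.18);
""")


def geocode_towns(connection):
    """Return (name, state, latitude, longitude) for every town.

    A LEFT JOIN keeps towns that have no gazetteer entry; their
    latitude and longitude come back as None so they can be reviewed.
    """
    query = """
        SELECT t.name, t.state, g.latitude, g.longitude
        FROM towns AS t
        LEFT JOIN gazetteer AS g
          ON g.name = t.name AND g.state = t.state
        ORDER BY t.name
    """
    return connection.execute(query).fetchall()


rows = geocode_towns(conn)
for row in rows:
    print(row)
```

Towns that come back without coordinates, such as the made-up 'Smallville' record here, are the ones that would still need to be looked up by hand.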

If we have a drawing that shows points for cities in the United States (such as a drawing of populated places) we can geocode the table using the drawing as a guide with Manifold's spatial Match tool. See the Spatial Geocoding with Match topic for more on this tool.

Because there are many geocoded tables of populated places that are easily obtained by free download from the Internet it is usually a straightforward matter to geocode a table of place names.

If we have the Manifold geocoding database installed, we can also geocode using the town name by following the fast and simple procedures in the Street Address Geocoding topic.

Geocoding a Table of Fire Hydrants

Suppose our town would like to create a GIS database of all fire hydrants in the town. We plan to use the power of GIS to help keep track of the status of all fire hydrants and to help plan regular maintenance, cleaning, flushing of water systems and so on. Let's say we have inherited a database of fire hydrants that provides an identification number for each hydrant, some status information on the hydrant and a "location" field that consists of a text comment noting what street the hydrant is on and what the nearest crossing street is. Our task is to geocode the table with the latitude and longitude location of each hydrant.

In the United States, the simplest way to accomplish this task is to connect a portable, WAAS-enabled GPS device to a laptop loaded with Manifold, turn on the GPS Console and then drive to the location of each fire hydrant. With Instant Data turned on we would place a point at the location of each fire hydrant and write down the identification number of that hydrant. The result of this process will be a map of points where each point is the location of a single fire hydrant. In addition to the object ID field, the drawing's table will have only one data attribute field in it, the identification number of the hydrant. We can then use this drawing together with Manifold's Match tool to geocode the hydrants database table using the identification number fields as key fields.

Although recording the locations of fire hydrants in an entire town in this way requires a substantial amount of driving, the process goes very rapidly when the GPS Console and Instant Data are used. WAAS-enabled GPS devices can achieve 2-meter (about 6 feet) accuracy, which is sufficient accuracy to locate fire hydrants.

Unfortunately, WAAS is generally available within the United States only. In regions outside the United States GPS devices will provide only 15-meter (about 50 feet) accuracy by default, which many people would not consider sufficient accuracy for mapping fire hydrants. 15-meter accuracy may be fine for locating bridges, which are large structures that are easily found when one is positioned within 15 meters of them, but in the case of smaller objects such as fire hydrants, especially if they are to be placed on digital maps in relation to features such as buried pipes, one normally would like better accuracy.

One way to accomplish this geocoding task outside of the United States would be to drive the city streets and manually mark on a paper map the locations of all hydrants. We could then scan the paper map, georegister the resultant image and then use Tracing to create a drawing of points that show the location of each hydrant. We could then enter the identification number for each hydrant into a data field for each point and then once again use Match to geocode the database table using the identification number as a key field.

Another alternative might be to acquire an aerial photograph of sufficient resolution that hydrants are visible, to scan in the photograph, georegister it and then use tracing to create a drawing of hydrant points and Match to geocode based on the drawing. Although overhead photography is probably not very practical in the case of fire hydrants (which would be obscured by trees in many cities) it is a very practical way of geocoding other infrastructure items, such as bridges or electrical transmission towers.

Note that the task of geocoding a table of fire hydrants is directly analogous to the task of geocoding a table of oil wells, a table of monitoring stations in a forest or, for that matter, any table of items whose location is not known. In all such cases we must determine the latitude and longitude location of each item by physically measuring it with a GPS, by marking the location accurately on a map or by determining the location from an aerial photograph. If the items to be geocoded are easy to reach and a GPS is available the geocoding process might be very straightforward. If they are far away and there is no aerial photograph or other map that can be used, then it could well be impossible to find their locations and thus geocode the table.

Geocoding a Table of Street Addresses

If we had a table of street addresses like the one below we could not plot these on a map because the table is not yet geocoded. Without a latitude and longitude location for each record we would not know where to place it on a map.

images/tbl_sushi_addresses_01.png

It is easy to make the conceptual mistake of thinking of a street location as being an exactly defined location, the same as a latitude/longitude location. However, that mistake arises mainly from how people use addresses to find locations for the delivery of mail or to go to a particular restaurant or other location. Street addresses, of course, do not really convey an exact latitude and longitude for the address. They simply provide a means by which a postal carrier or someone else physically traversing the streets can find a particular address. To find an address we have to find the street (with the help of a map if we don't know a particular town), orient ourselves to the address system used on that street and then locate the address. As anyone who has tried to find an out-of-sequence address in an unfamiliar town knows, there is a great difference between hunting down a particular street address and going directly to a latitude/longitude location.

images/dwg_sushi_addresses_01.png

It is one thing to be able to find a given street address by physically going there (perhaps with the help of a local street map) and it is quite another thing to plot a table of street addresses, such as the table of restaurant addresses, on a map as seen above without ever going to the actual address. To plot each restaurant shown in the table we need to know the actual latitude and longitude location of each restaurant. To do that, the table must be geocoded as seen below.

images/tbl_sushi_addresses_02.png

In recent years the adoption of geocoding technology by consumer computer applications (at least in many First World nations) has also encouraged us to think of street addresses as being equivalent to a latitude/longitude location for the purpose of computer mapping. Internet mapping sites allow us to enter a street address, such as "525 Main Street, Carson City," and instantly see a street map with the location of the address marked as if we had provided an exact latitude and longitude location. Low-cost navigation systems that combine GPS technology with built-in maps and street address geocoding systems allow us to specify a street address and navigate directly to that location, again, as if we had given exact latitude/longitude coordinates for our desired destination.

As a result, it is quite common for people to expect to be able to enter a street address into a web site or a map and to see a physical location for that address, a sort of "geocoding on the fly." Some applications may give the appearance of taking a list of addresses and displaying them straightaway as points in a map; however, in all cases the software will internally take the intermediate step of using the address to determine a latitude and longitude location for the record. The latitude and longitude location is then used to plot the location of the point.

Software packages use many different strategies to geocode street addresses into latitude and longitude locations. The basic approach is to maintain a large database of streets and address ranges so that the location of a particular address can be estimated from the database. Software that can perform street address geocoding may be built into a GIS package, it may be sold as separate geocoding software, or it may be provided as an Internet web service.

Manifold System includes built-in street address geocoding capability for the United States. The Manifold street address geocoding engine becomes functional when the Manifold US streets geocoding database is installed. If we have installed that database on our system we can take a table that contains valid US street addresses and geocode the table to the approximate positions of the addresses.

However, when using any street address geocoder it is important to understand that the output of the geocoder is an approximate location.

To geocode addresses outside the United States, Manifold includes an option to use Microsoft's MapPoint product as a geocoder for addresses in Canada or in eleven European countries. See the Geocoding with MapPoint topic.

How Street Address Geocoding Works

To geocode street addresses, any geocoding software (including Manifold) must find the address and an equivalent location in a database. However, there is no database anywhere in the world that specifies an accurate location for each street address. To take the United States for example, there is no national database that specifies exactly where all addresses are located. This is mainly because addresses in the US are highly irregular, are poorly documented and change too rapidly for either private companies or government agencies to be able to keep up with perfect accuracy.

The closest approximations to a national database of address locations that exist in the United States are the U.S. Bureau of the Census "TIGER" database and the TIGER/Line data sets derived from TIGER. TIGER/Line attempts to show known roads with address ranges for each road segment. Actual addresses are not noted, but are represented only as a best effort at showing the address range (from lowest to highest address number) that occurs in a particular street segment. Most geocoding software in the United States, including Manifold, uses databases that are derived in some way from the TIGER/Line data sets.

Based on data sets like those created by the Census Bureau, geocoding software can be created that compares a record's address, such as "525 Main Street, Carson City, Nevada, 89701" to an internal database of street segment coordinates and address ranges for each segment. For example, after zeroing in to Main Street by using State, ZIP code and City fields, the software can find the right Main Street segment that contains the address range for the address number at hand. If one particular Main Street segment has a high value of 600 and a low value of 500 for the address range on that segment, the software could then reasonably infer that 525 Main Street is located about one fourth of the way up that particular street segment. It could then assign the latitude and longitude of that interpolated spot to the record.
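The interpolation step described above can be sketched in a few lines of Python. This is only a minimal illustration that assumes a perfectly straight segment with evenly spaced addresses; the function name and the segment end coordinates are hypothetical, and real geocoders (including Manifold's) may use more elaborate strategies.

```python
def interpolate_address(number, low, high, start, end):
    """Estimate a location for a house number along a straight street segment.

    low and high are the address range recorded for the segment.
    start and end are (latitude, longitude) tuples for the segment ends
    that carry the low and high address numbers respectively.
    Returns the fraction of the way along the segment and the estimated
    (latitude, longitude).
    """
    if not low <= number <= high:
        raise ValueError("address number is outside the segment's range")
    if high == low:
        fraction = 0.0  # single-address segment: use the start point
    else:
        fraction = (number - low) / (high - low)
    lat = start[0] + fraction * (end[0] - start[0])
    lon = start[1] + fraction * (end[1] - start[1])
    return fraction, (lat, lon)


# 525 Main Street on a segment with address range 500 to 600.
# The segment end coordinates are made-up values for illustration.
fraction, location = interpolate_address(
    525, 500, 600, (39.1600, -119.7700), (39.1640, -119.7700)
)
print(fraction)  # 0.25, one fourth of the way up the segment
print(location)
```

Note that the result is entirely a function of the recorded address range: if the real buildings are unevenly spaced along the segment, the estimate will be off by a corresponding amount.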

It is important to understand that the geocoding software has no idea where the actual address is located. It simply interpolates the location of the address by making what is hopefully a reasonable guess based on the address range recorded for a given street segment. Clever software can use a variety of strategies to make better guesses, but at the end of the day the results are usually accurate to only within a city block in urban areas and are wildly inaccurate in rural areas. Addresses of the form "Rural Route 10 Box 82," for example, in a rural area might not be geocodable to within tens of miles if they are geocodable at all.

Creating geocoding software that can accurately assign an exact, non-interpolated location for each individual address requires a database of all addresses and their exact latitude and longitude locations. To support 911 service and other emergency response services, some towns are using GPS equipment to create precise databases that show the exact location of each address in their town.

For information on using Manifold's street address geocoding system, see the Street Address Geocoding topic.

Street Address Geocoding Outside the United States

Unfortunately, the United States is the only country that places large government databases of street address ranges like TIGER/Line into the public domain. In other countries, acquiring a database that shows streets and address ranges for those streets is very costly, and in many cases not possible.

As a result, there are many fewer choices for street address geocoding software outside of the United States. Because the Manifold geocoder is based on public domain government data, Manifold provides no streets database for locations outside the United States. However, Manifold includes an option to use Microsoft's MapPoint product as a geocoder for addresses in Canada or in eleven European countries. See the Geocoding with MapPoint topic.

Geocoded Tables use Decimal Degree Notation

Geocoded tables in Manifold must have valid latitude and longitude fields consisting of degrees from 0 to +/- 180 longitude and 0 to +/- 90 latitude, with partial degrees denoted as decimal fractions. A minus sign denotes West longitudes and South latitudes. This style of writing latitudes and longitudes is called decimal degrees.
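The range rules above are easy to check before a table is imported. The following Python sketch is not part of Manifold; the function name is hypothetical, and it simply tests whether a pair of values (given as numbers or as text) parses as decimal degrees within the stated limits.

```python
def is_valid_decimal_degrees(latitude, longitude):
    """Return True if both values are decimal degrees within range.

    Accepts numbers or numeric text. Values that do not parse as
    plain numbers (for example 'N 37.45') are rejected.
    """
    try:
        lat = float(latitude)
        lon = float(longitude)
    except (TypeError, ValueError):
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


print(is_valid_decimal_degrees("37.45", "-122.18"))  # True
print(is_valid_decimal_degrees(95, 10))              # False: latitude out of range
print(is_valid_decimal_degrees("N 37.45", "W 122"))  # False: not plain numbers
```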

Like all modern GIS packages, Manifold uses decimal degree notation because it is an unambiguous standard that is well suited for arithmetic operations and can be written in database tables as text fields or numeric fields. Older methods of writing latitudes and longitudes, such as the use of the letters "E", "W", "N" and "S" or the use of degrees, minutes and seconds notation, are not well standardized and involve clumsy notation that is not very useful in computing operations. Manipulating values such as East 32° 42' 15" is somewhat akin to trying to do longhand multiplication using Roman numerals… not very efficient or sensible.

In modern times most databases of geocoded information use decimal degrees. However, over the years there have been many different styles used to write latitudes and longitudes in database tables. Older tables might use text fields to express coordinates in the form of degrees, minutes and seconds, for example. Other tables may use degrees, minutes and decimal fractions of minutes. Some tables will denote longitudes in degrees from 0 to 360. Others might use text strings and prefix a letter, such as "N", "S", "E" and "W" to indicate North or South latitudes and East or West longitudes.

Manifold's approach to dealing with such tables is to import them into Manifold where Transform toolbar operators and other tools can be used to convert coordinates into standard decimal degree notation. This allows the full power of Manifold tools to be brought to bear to adjust the table into the desired form. Clever use of token operations will allow transformation of any format into the desired decimal degrees. See, for example, the Using Tokens and Text Strings topic and the Extract Last Names using Tokens example.

If you have a table that has latitude and longitude values using some old-fashioned notation you must first translate those values into modern decimal degree notation. Only then is it a geocoded table.
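The arithmetic behind such a translation is simple. The Python sketch below is independent of Manifold's Transform toolbar and uses hypothetical function names; it assumes the degrees, minutes, seconds and hemisphere letter have already been split out of the original text, and it also handles longitudes written in the 0 to 360 style.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees.

    South latitudes and West longitudes come back negative.
    Decimal minutes can be passed directly with seconds left at 0.
    """
    hemisphere = hemisphere.strip().upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or minutes < 0 or seconds < 0:
        raise ValueError("use the hemisphere letter, not a minus sign")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


def longitude_from_0_360(longitude):
    """Convert a longitude written as 0 to 360 degrees into -180 to +180."""
    if not 0.0 <= longitude <= 360.0:
        raise ValueError("longitude must be between 0 and 360")
    return longitude - 360.0 if longitude > 180.0 else longitude


print(dms_to_decimal(32, 42, 15, "E"))   # East 32 degrees 42' 15"
print(dms_to_decimal(122, 10.8, 0, "W")) # degrees and decimal minutes, West
print(longitude_from_0_360(240.23))      # a 0 to 360 style longitude
```

Parsing the original text fields into these components is the harder part in practice, since the punctuation and ordering vary from table to table.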

"Generic" Geocoding Strategies

Geocoding a table by specific addresses is often not required. Although it is easy to understand the conceptual appeal of adding an exact latitude and longitude position for each customer record by address, such geocoded tables also lay a conceptual trap for the unwary in that they are intrinsically inaccurate. Sometimes it is better to have an approximate table that does not lay claims to false accuracy. For many GIS purposes it may be enough to simply pin down a customer location to a specific ZIP code and not to a specific city block. See the Spatial Geocoding with Match topic for spatial geocoding within Manifold.

By spatially geocoding tables using key fields we can often end up with a geocoded table that combines our desired records with a latitude and longitude position for each record. The classic example is displaying customer address records using their ZIP codes. If we have a drawing that shows a point for each ZIP code centroid we can merge the customer address table into this drawing using the ZIP code as a key field. In that case, customer records will appear as a point at the ZIP code centroid for their ZIP code. [The Manifold street address geocoding engine can also geocode addresses that consist only of ZIP codes, by geocoding the address to the ZIP code centroid, so as a practical matter there is no need to use Match to geocode to ZIP codes in the US.]
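The key-field idea can be sketched outside Manifold as a simple lookup. In the Python example below the function name, the customer records and the centroid coordinates are all hypothetical; ZIP codes are kept as text so that leading zeros are not lost.

```python
def geocode_by_zip(customers, zip_centroids):
    """Attach a ZIP code centroid to each customer record.

    customers is a list of dicts that each have a 'zip' key.
    zip_centroids maps ZIP code text to (latitude, longitude).
    Returns (geocoded, unmatched): geocoded records are copies with
    'latitude' and 'longitude' added, unmatched records are unchanged.
    """
    geocoded, unmatched = [], []
    for customer in customers:
        centroid = zip_centroids.get(customer["zip"])
        if centroid is None:
            unmatched.append(customer)
        else:
            lat, lon = centroid
            geocoded.append({**customer, "latitude": lat, "longitude": lon})
    return geocoded, unmatched


# Made-up centroid values and customer records, for illustration only.
zip_centroids = {"89701": (39.15, -119.73), "94025": (37.45, -122.18)}
customers = [
    {"name": "Customer A", "zip": "89701"},
    {"name": "Customer B", "zip": "94025"},
    {"name": "Customer C", "zip": "00000"},
]

geocoded, unmatched = geocode_by_zip(customers, zip_centroids)
print(geocoded)
print(unmatched)
```

Every customer in the same ZIP code lands on the same point, which makes the approximate nature of the result obvious rather than hiding it behind a falsely precise position.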

This "generic" method of geocoding is often the only possible method of geocoding for international users who do not have access to a street address geocoder for their locations but who do have postal code or telephone code maps or other data sets that can be used as guides for spatial geocoding based on Match.

See Also

Creating Drawings from Geocoded Tables

Linked Drawings and Geocoded Tables

Geocoding

Street Address Geocoding

Geocoding with MapPoint

Spatial Geocoding with Match

Create a Linked Drawing from a Geocoded Table

Create a Map from a Geocoded Table

Notes

The table of sushi restaurants lists sushi restaurants approximately within one mile (1.6 km) of the main USGS facility in Menlo Park. The map in the illustration shows the USGS facility as a yellow diamond and plots the restaurants as green dots. Once we get over our astonishment at the provinciality of a region that can support no more than eleven sushi restaurants per mile, we can see that the restaurants in Palo Alto in the lower right are more tightly clustered than those in Menlo Park, which tend to be spread out along a single main road.
