About Geocoding
A geocoded table is a table where every record has a location given in latitude and
longitude, using standard decimal degrees notation. Unless each record is geocoded
with a latitude and longitude location, Manifold cannot know where that record is
located. Once a table is geocoded it may be used to create drawings.
For example, the table above is not geocoded. It lists the names of towns in
the United States but there is no way to tell from the table exactly where the
towns are located. If we were to try to draw points on a paper map for each
town we would not know where to place the points. If a table is not geocoded, it
cannot be used to create a drawing in Manifold either, because Manifold also
would not know where to put the points.
In contrast, the above table is geocoded. Each record now has a latitude and
longitude location given in decimal degrees notation. We could use the
latitude and longitude values to draw a point for each town on a paper map of the
United States. Manifold could also use this table to create a drawing.
If a table is geocoded it can be used to create a drawing, which in turn can
be used in a map like the illustration above. Right away, the positions of the
towns convey a visual impression of their locations that one does not
get in a table presentation. That's part of the great power of a GIS like
Manifold. Naturally, everyone who has important data and a GIS package would like
their data to be geocoded so the data can immediately be displayed visually
within the GIS.
The problem is that a lot of the important data sets we deal with, whether
they are lists of customer addresses or lists of oil wells or lists of fire
hydrants that need maintenance, are not geocoded. The central problem for many GIS
users is getting their data geocoded. Depending on the contents of the table,
geocoding the table can be a reasonably straightforward process or it can be
very difficult or even impossible.
Let's take a look at three geocoding tasks to see what approaches to geocoding
are possible in different cases. We will look at geocoding a table of towns,
geocoding a table of fire hydrants and geocoding a table of street addresses.
Geocoding a Table of Towns
Suppose we have a table with town names like the first example above. How can
we geocode it? In the simplest case we look up the latitude and longitude of
each town in a reference book or atlas and we add the latitude and longitude
to the table for each record by hand.
If we have a table somewhere that already lists latitudes and longitudes for
towns we could, of course, extract information from that table automatically.
If we have another database table that contains the town name and latitude and
longitude we could use SQL facilities such as a join to combine the two tables
via a relation using a key field such as the name of the town.
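Combining a table of town names with a reference table of locations by key field can be sketched outside Manifold as follows. This is a minimal illustration using Python's built-in sqlite3 module, not Manifold's own SQL dialect. The table names, field names and approximate coordinates are hypothetical. The sketch assumes the state is used as a second key field, since town names alone are often not unique.

```python
import sqlite3

# Hypothetical tables and approximate coordinates, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE towns (name TEXT, state TEXT);
    CREATE TABLE places (name TEXT, state TEXT, latitude REAL, longitude REAL);
    INSERT INTO towns VALUES
        ('Carson City', 'NV'), ('Menlo Park', 'CA'), ('Smallville', 'KS');
    INSERT INTO places VALUES
        ('Carson City', 'NV', 39.16, -119.77),
        ('Menlo Park', 'CA', 37.45, -122.18);
""")

# A LEFT JOIN keeps towns that have no match in the reference table, so the
# records that still need geocoding by hand show up with empty coordinates.
rows = conn.execute("""
    SELECT t.name, t.state, p.latitude, p.longitude
    FROM towns AS t
    LEFT JOIN places AS p
        ON p.name = t.name AND p.state = t.state
    ORDER BY t.name
""").fetchall()

for row in rows:
    print(row)
```

Records that come back without coordinates are the ones that would still have to be looked up by hand or matched in some other way.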
If we have a drawing that shows points for cities in the United States (such
as a drawing of populated places) we can geocode the table using the drawing as
a guide with Manifold's spatial Match tool. See the Spatial Geocoding with Match topic for more on this tool.
Because there are many geocoded tables of populated places that are easily
obtained by free download from the Internet it is usually a straightforward matter
to geocode a table of place names.
If we have the Manifold geocoding database installed, we can also geocode
using the town name by following the fast and simple procedures in the Street Address Geocoding topic.
Geocoding a Table of Fire Hydrants
Suppose our town would like to create a GIS database of all fire hydrants in
the town. We plan to use the power of GIS to help keep track of the status of
all fire hydrants and to help plan regular maintenance, cleaning, flushing of
water systems and so on. Let's say we have inherited a database of fire
hydrants that provides an identification number for each hydrant, some status
information on the hydrant and a "location" field that consists of a text comment noting
what street the hydrant is on and what the nearest cross street is. Our
task is to geocode the table with the latitude and longitude location of each
hydrant.
In the United States, the simplest way to accomplish this task is to connect a
portable, WAAS-enabled GPS device to a laptop loaded with Manifold, turn on
the GPS Console and then drive to the location of each fire hydrant. With Instant Data turned on we would place a point at the location of each fire hydrant and
write down the identification number of that hydrant. The result of this process
will be a map of points where each point is the location of a single fire
hydrant. In addition to the object ID field, the drawing's table will have only one
data attribute field in it, the identification number of the hydrant. We can
then use this drawing together with Manifold's Match tool to geocode the hydrants database table using the identification number
fields as key fields.
Although recording the locations of fire hydrants in an entire town in this
way requires a substantial amount of driving, the process goes very rapidly when
the GPS Console and Instant Data are used. WAAS-enabled GPS devices can
achieve 2-meter (about 6 feet) accuracy, which is sufficient accuracy to locate fire
hydrants.
Unfortunately, WAAS is generally available within the United States only. In
regions outside the United States GPS devices will provide only 15-meter (about
50 feet) accuracy by default, which many people would not consider sufficient
accuracy for mapping fire hydrants. 15-meter accuracy may be fine for
locating bridges, which are large structures that are easily found when one is
positioned within 15 meters of them, but in the case of smaller objects such as fire
hydrants, especially if they are to be placed on digital maps in relation to
features such as buried pipes, one normally would like better accuracy.
One way to accomplish this geocoding task outside of the United States would
be to drive the city streets and manually mark on a paper map the locations of
all hydrants. We could then scan the paper map, georegister the resultant image
and then use Tracing to create a drawing of points that show the location of each hydrant. We
could then enter the identification number for each hydrant into a data field for
each point and then once again use Match to geocode the database table using the identification number as a key field.
Another alternative might be to acquire an aerial photograph of sufficient
resolution that hydrants are visible, to scan in the photograph, georegister it
and then use tracing to create a drawing of hydrant points and Match to geocode based on the drawing. Although overhead photography is probably
not very practical in the case of fire hydrants (which would be obscured by
trees in many cities) it is a very practical way of geocoding other infrastructure
items, such as bridges or electrical transmission towers.
Note that the task of geocoding a table of fire hydrants is directly analogous
to the task of geocoding a table of oil wells, a table of monitoring stations
in a forest or, for that matter, any table of items whose location is not
known. In all such cases we must determine the latitude and longitude location of
each item by either physically measuring the latitude and longitude with a GPS,
by marking the location accurately on a map or by determining the locations
using an aerial photograph. If the items to be geocoded are easy to reach and a
GPS is available the geocoding process might be very straightforward. If they
are far away and there is no aerial photograph or other map that can be used,
then it could well be impossible to find their locations and thus geocode the
table.
Geocoding a Table of Street Addresses
If we had a table of street addresses like the one below we could not plot
these on a map because the table is not yet geocoded. Without a latitude and
longitude location for each record we would not know where to place it on a map.
It is easy to make the conceptual mistake of thinking of a street location as
being an exactly defined location, the same as a latitude/longitude location.
However, that mistake arises mainly from how people use addresses to find
locations for the delivery of mail or to go to a particular restaurant or other
location. Street addresses, of course, do not really convey an exact latitude and
longitude for the address. They simply provide a means by which a postal
carrier or someone else physically traversing the streets can find a particular
address. To find an address we have to find the street (with the help of a map if
we don't know a particular town), orient ourselves to the address system used
on that street and then locate the address. As anyone who has tried to find an
out-of-sequence address in an unfamiliar town knows, there is a great
difference between hunting down a particular street address and going directly to a
latitude/longitude location.
It is one thing to be able to find a given street address by physically going
there (perhaps with the help of a local street map) and it is quite another
thing to plot a table of street addresses, such as the table of restaurant
addresses, on a map as seen above without ever going to the actual address. To plot
each restaurant shown in the table we need to know the actual latitude and
longitude location of its address. To do that, the table must be geocoded as
seen below.
In recent years the adoption of geocoding technology by consumer computer
applications (at least in many First World nations) has also encouraged us to think
of street addresses as being equivalent to a latitude/longitude location for
the purpose of computer mapping. Internet mapping sites allow us to enter a
street address, such as "525 Main Street, Carson City," and instantly see a
street map with the location of the address marked as if we had provided an exact
latitude and longitude location. Low-cost navigation systems that combine GPS
technology with built-in maps and street address geocoding systems allow us to
specify a street address and navigate directly to that location, again, as if
we had given exact latitude/longitude coordinates for our desired destination.
As a result, it is quite common for people to expect to be able to enter a
street address into a web site or a map and to see a physical location for that
address, a sort of "geocoding on the fly." Some applications may give the
appearance of taking a list of addresses and displaying them straightaway as points
in a map; however, in all cases the software will internally take the
intermediate step of using the address to determine a latitude and longitude location for
the record. The latitude and longitude location is then used to plot the
location of the point.
Software packages use many different strategies to geocode street addresses
into latitude and longitude locations. The basic approach is to maintain a
large database of streets and address ranges so that the location of a particular
address can be estimated from the database. Software that can perform street
address geocoding may be built into a GIS package, it may be sold as separate
geocoding software, or it may be provided as an Internet web service.
Manifold System includes built-in street address geocoding capability for the United States.
The Manifold street address geocoding engine becomes functional when the
Manifold US streets geocoding database is installed. If we have installed the
Manifold US streets geocoding database on our system we can take a table that
contains valid US street addresses and geocode the table to the approximate
positions of the addresses.
However, when using any street address geocoder it is important to understand
that the output of the geocoder is an approximate location.
To geocode addresses outside the United States, Manifold includes an option to
use Microsoft's MapPoint product as a geocoder for addresses in Canada or in
eleven European countries. See the Geocoding with MapPoint topic.
How Street Address Geocoding Works
To geocode street addresses, any geocoding software (including Manifold) must
find the address and an equivalent location in a database. However, there is
no database anywhere in the world that specifies an accurate location for each
street address. To take the United States for example, there is no national
database that specifies exactly where all addresses are located. This is mainly
because addresses in the US are highly irregular, are poorly documented and
change too rapidly for either private companies or government agencies to be able
to keep up with perfect accuracy.
The closest approximations to a national database of address locations that
exist in the United States are the U.S. Bureau of the Census "TIGER" database and
the TIGER/Line data sets derived from TIGER. TIGER/Line attempts to show
known roads with address ranges for each road segment. Actual addresses are not
recorded; they are represented only by a best-effort address range
(from lowest to highest address number) that occurs in a particular street
segment. Most geocoding software in the United States, including Manifold, uses
databases that are derived in some way from the TIGER/Line data sets.
Based on data sets like those created by the Census Bureau, geocoding software
can be created that compares a record's address, such as "525 Main Street,
Carson City, Nevada, 89701" to an internal database of street segment coordinates
and address ranges for each segment. For example, after zeroing in to Main
Street by using State, ZIP code and City fields, the software can find the right
Main Street segment that contains the address range for the address number at
hand. If one particular Main Street segment has a high value of 600 and a low
value of 500 for the address range on that segment, the software could then
reasonably infer that 525 Main Street is located about one fourth of the way up
that particular street segment. It could then assign the latitude and longitude
of that interpolated spot to the record.
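The interpolation step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Manifold's geocoding engine. It assumes a straight street segment and ignores refinements such as odd and even sides of the street. The function name and the segment end coordinates are hypothetical.

```python
def interpolate_address(number, low, high, start, end):
    """Estimate the position of an address number on a straight street segment.

    low and high are the address range recorded for the segment.
    start and end are (latitude, longitude) pairs for the low and high ends.
    Returns the fraction of the way along the segment and the estimated location.
    """
    if not low <= number <= high:
        raise ValueError("address number is outside the segment's address range")
    # Guard against a segment whose range holds a single address number.
    fraction = 0.0 if high == low else (number - low) / (high - low)
    latitude = start[0] + fraction * (end[0] - start[0])
    longitude = start[1] + fraction * (end[1] - start[1])
    return fraction, (latitude, longitude)


# 525 Main Street on a segment with the address range 500 to 600.
# The end coordinates are made up for the example.
fraction, location = interpolate_address(
    525, 500, 600, (39.1600, -119.7670), (39.1640, -119.7670)
)
print(fraction, location)
```

With a range of 500 to 600 the fraction for number 525 works out to (525 - 500) / (600 - 500) = 0.25, the "one fourth of the way" figure used in the text.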
It is important to understand that the geocoding software has no idea where
the actual address is located. It simply interpolates the location of the
address by making what is hopefully a reasonable guess based on the address range
recorded for a given street segment. Clever software can use a variety of
strategies to make better guesses, but at the end of the day the results are usually
accurate to only within a city block in urban areas and are wildly inaccurate in
rural areas. A rural address of the form "Rural Route 10 Box 82," for example,
might not be geocodable to within tens of miles, if it is geocodable at all.
Creating geocoding software that can accurately assign an exact,
non-interpolated location for each individual address requires a database of all addresses
and their exact latitude and longitude locations. To support 911 service and
other emergency response services, some towns are using GPS equipment to create
precise databases that show the exact location of each address in their town.
For information on using Manifold's street address geocoding system, see the Street Address Geocoding topic.
Street Address Geocoding Outside the United States
Unfortunately, the United States is the only country that places large
government databases of street address ranges like TIGER/Line into the public domain.
In other countries, acquiring a database that shows streets and address ranges
for those streets is very costly, and in many cases not possible.
As a result, there are far fewer choices for street address geocoding
software outside of the United States. Because the Manifold geocoder is based on
public domain government data, Manifold provides no streets database for locations
outside the United States. However, Manifold includes an option to use
Microsoft's MapPoint product as a geocoder for addresses in Canada or in eleven
European countries. See the Geocoding with MapPoint topic.
Geocoded Tables Use Decimal Degree Notation
Geocoded tables in Manifold must have valid latitude and longitude fields
consisting of degrees from 0 to +/- 180 longitude and 0 to +/- 90 latitude, with
partial degrees denoted as decimal fractions. A minus sign denotes West
longitudes and South latitudes. This style of writing latitudes and longitudes is
called decimal degrees.
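As an illustration of these ranges, the following Python sketch checks a latitude and longitude pair held as either text or numeric values. The function name is hypothetical and the check is not part of Manifold. It only shows the rule stated above.

```python
def parse_decimal_degrees(latitude, longitude):
    """Return (latitude, longitude) as floats, or raise ValueError.

    Accepts numbers or text such as "39.16" and "-119.77".
    """
    lat, lon = float(latitude), float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return lat, lon


# A minus sign marks a West longitude (and would mark a South latitude).
print(parse_decimal_degrees("39.16", "-119.77"))
```

A range check of this kind catches some swapped latitude and longitude fields, but only when the swapped value falls outside the range of -90 to +90.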
Like all modern GIS packages, Manifold uses decimal degree notation because it
is an unambiguous standard that is well suited for arithmetic operations and
can be written in database tables as text fields or numeric fields. Older
methods of writing latitudes and longitudes, such as the use of the letters "E",
"W", "N" and "S" or the use of degrees, minutes and seconds notation are not well
standardized and involve clumsy notation that is not very useful in computing
operations. Manipulating values such as East 32 42' 15" is somewhat akin to trying to do longhand multiplication using Roman
numerals… not very efficient or sensible.
In modern times most databases of geocoded information use decimal degrees.
However, over the years there have been many different styles used to write
latitudes and longitudes in database tables. Older tables might use text fields to
express coordinates in the form of degrees, minutes and seconds, for example.
Other tables may use degrees, minutes and decimal fractions of minutes. Some
tables will denote longitudes in degrees from 0 to 360. Others might use text
strings and prefix a letter, such as "N", "S", "E" and "W" to indicate North or
South latitudes and East or West longitudes.
Manifold's approach to dealing with such tables is to import them into
Manifold where Transform toolbar operators and other tools can be used to convert coordinates into standard
decimal degree notation. This allows the full power of Manifold tools to be
brought to bear to adjust the table into the desired form. Clever use of token
operations will allow transformation of any format into the desired decimal
degrees. See, for example, the Using Tokens and Text Strings topic and the Extract Last Names using Tokens example.
If you have a table that has latitude and longitude values using some
old-fashioned notation you must first translate those values into modern decimal degree
notation. Only then is it a geocoded table.
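The arithmetic behind such a translation is simple. The Python sketch below is a generic illustration and does not use Manifold's Transform toolbar or token operators. All function names are hypothetical. It assumes text values mark the hemisphere with a single letter (N, S, E or W) rather than a sign or a spelled-out word.

```python
import re


def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere=""):
    """Convert degrees, minutes and seconds to decimal degrees.

    South and West values come out negative. Decimal minutes also work:
    pass them as minutes and leave seconds at zero.
    """
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    negative = degrees < 0 or hemisphere.upper() in ("S", "W")
    return -value if negative else value


def wrap_longitude(longitude_0_360):
    """Convert a longitude written from 0 to 360 into the -180 to +180 form."""
    if longitude_0_360 > 180.0:
        return longitude_0_360 - 360.0
    return longitude_0_360


def parse_dms_text(text):
    """Parse text such as 'W 119 46 3' or '32 42 15 E' into decimal degrees."""
    letters = re.findall(r"[NSEWnsew]", text)
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers or len(numbers) > 3 or len(letters) > 1:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    return dms_to_decimal(*numbers, hemisphere=letters[0] if letters else "")


print(dms_to_decimal(32, 42, 15, "E"))
print(parse_dms_text("W 119 46 3"))
print(wrap_longitude(240.23))
```

Real tables tend to contain more variations than this sketch handles, so results should be spot-checked against known locations before the table is treated as geocoded.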
"Generic" Geocoding Strategies
Geocoding a table by specific addresses is often not required. Although it is
easy to understand the conceptual appeal of adding an exact latitude and
longitude position for each customer record by address, such geocoded tables also
lay a conceptual trap for the unwary in that they are intrinsically inaccurate.
Sometimes it is better to have an approximate table that does not lay claims to
false accuracy. For many GIS purposes it may be enough to simply pin down a
customer location to a specific ZIP code and not to a specific city block. See
the Spatial Geocoding with Match topic for spatial geocoding within Manifold.
By spatially geocoding tables using key fields we can often end up with a
geocoded table that combines our desired records with a latitude and longitude
position for each record. The classic example is displaying customer address
records using their ZIP codes. If we have a drawing that shows a point for each
ZIP code centroid we can merge the customer address table into this drawing using
the ZIP code as a key field. In that case, customer records will appear as a
point at the ZIP code centroid for their ZIP code. [The Manifold street
address geocoding engine can also geocode addresses that consist only of ZIP
codes, by geocoding the address to the ZIP code centroid, so as a practical
matter there is no need to use Match to geocode to ZIP codes in the US.]
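The idea of geocoding to a ZIP code centroid can be sketched in Python as a simple lookup by key field. This is an illustration of the concept only, not of the Match tool. The customer names and centroid coordinates are made up, and the function name is hypothetical.

```python
# Hypothetical ZIP code centroids as (latitude, longitude), values made up.
# ZIP codes are kept as text so that leading zeros are not lost.
zip_centroids = {
    "89701": (39.15, -119.73),
    "94025": (37.45, -122.18),
}

customers = [
    {"name": "A. Jones", "zip": "89701"},
    {"name": "B. Smith", "zip": "94025"},
    {"name": "C. Brown", "zip": "00000"},
]


def geocode_by_zip(records, centroids):
    """Attach the centroid of each record's ZIP code as its location.

    Returns two lists: geocoded records and records with an unknown ZIP code.
    """
    geocoded, unmatched = [], []
    for record in records:
        centroid = centroids.get(record["zip"])
        if centroid is None:
            unmatched.append(record)
        else:
            geocoded.append(
                {**record, "latitude": centroid[0], "longitude": centroid[1]}
            )
    return geocoded, unmatched


geocoded, unmatched = geocode_by_zip(customers, zip_centroids)
print(geocoded)
print(unmatched)
```

All customers in the same ZIP code receive the same coordinates, which makes plain that the result is an approximate location and not an exact one.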
This "generic" method of geocoding is often the only possible method of
geocoding for international users who do not have access to a street address geocoder
for their locations but who do have postal code or telephone code maps or
other data sets that can be used as guides for spatial geocoding based on Match.
See Also
Creating Drawings from Geocoded Tables
Linked Drawings and Geocoded Tables
Geocoding
Street Address Geocoding
Geocoding with MapPoint
Spatial Geocoding with Match
Create a Linked Drawing from a Geocoded Table
Create a Map from a Geocoded Table
Notes
The table of sushi restaurants lists sushi restaurants approximately within
one mile (1.6 km) of the main USGS facility in Menlo Park. The map in the
illustration shows the USGS facility as a yellow diamond and plots the restaurants as
green dots. Once we get over our astonishment at the provinciality of a
region that can support no more than eleven sushi restaurants per mile, we can see
that the restaurants in Palo Alto in the lower right are more tightly clustered
than those in Menlo Park, which tend to be spread out along a single main road.