USGS - science for a changing world

USGS Education

Lakota Studies 400/600:  Special Topics:  Introduction to Geographic Information Systems and Science

Week 3 Notes:  Data Quality

Data Quality

As we progress through the course and work with different data sets, you'll notice that some data is complete, while other data is incomplete.  For example, the demographic data you worked with during Week 1 was missing values for population and some other variables, usually indicated by the -99999 values in the tables.  Think about the advantages of coding missing data with a negative number rather than a "0".  What is different about 0 versus a negative number when you are conducting searches and querying the data?
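One way to explore that question is with a small sketch like the one below.  The county names and population figures are made up for illustration; only the -99999 missing-data code comes from the Week 1 tables.  It suggests that a negative code can be told apart from a true count of 0, but that a careless "less than" query will still sweep the missing records in unless they are filtered out first.

```python
# Hypothetical population table; -99999 is the missing-data code from Week 1.
MISSING = -99999

populations = {
    "County A": 12500,
    "County B": 0,        # a real count of zero people
    "County C": MISSING,  # population was not reported
    "County D": 830,
}


def known(values, missing=MISSING):
    """Keep only the records that hold real data."""
    return {name: v for name, v in values.items() if v != missing}


def under(values, threshold, missing=MISSING):
    """Names of records with a real value below the threshold."""
    real = known(values, missing)
    return sorted(name for name, v in real.items() if v < threshold)


# A naive query treats the missing-data code as if it were a tiny population.
naive = sorted(name for name, v in populations.items() if v < 1000)

# Filtering first keeps the true zero and drops the missing record.
careful = under(populations, 1000)

# Statistics should also be computed on known values only.
real_values = known(populations)
mean_known = sum(real_values.values()) / len(real_values)

print(naive)
print(careful)
print(round(mean_known, 1))
```

If missing values had been coded as 0 instead, County B and County C would be indistinguishable, and no filter could separate them.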

Data quality is a critically important issue within the geographic information sciences.  In today's world, people tend to think that if data is on a computer or other high-tech device, then it is, at worst, better than the original paper documents it was derived from and, at best, error-free.  In addition, because a GIS can represent absolute locations to many decimal places, such as 43.10474 north latitude, people assume that the data is more accurate than a paper map.

Nothing could be further from the truth.  A GIS may be more precise, able to represent data and attributes with more decimal places, but it is not necessarily more accurate.  Maps are scientific documents with standards, and those standards vary widely depending on the date, the producer, the purpose, and the scale.  All maps are inherently inaccurate because they all attempt to represent the three-dimensional Earth on a two-dimensional piece of paper or computer screen.  That is why spatial data shown on a flat map relies on a map projection, and every projection distorts distances, directions, shapes, angles, areas, or other aspects of the Earth.
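The difference between precision and accuracy can be put into rough numbers.  The sketch below assumes that one degree of latitude spans about 111 kilometers on the ground (an approximation that varies slightly with latitude) and estimates how small a distance the last decimal place of 43.10474 implies.  The function names are illustrative only.

```python
# Assumption: one degree of latitude is roughly 111 km on the ground.
METERS_PER_DEGREE_LAT = 111_000


def decimal_places(coordinate_text):
    """Count the digits written after the decimal point."""
    _, _, fraction = coordinate_text.partition(".")
    return len(fraction)


def implied_resolution_m(places):
    """Approximate ground distance represented by the last decimal place."""
    return METERS_PER_DEGREE_LAT * 10 ** (-places)


lat_text = "43.10474"                        # latitude quoted in the notes
places = decimal_places(lat_text)            # 5 decimal places
resolution = implied_resolution_m(places)    # roughly one meter

print(f"{lat_text} implies a resolution of about {resolution:.2f} m")
```

Five decimal places suggest the location is known to about a meter.  That is a statement about how the number is written, not about how well the feature was surveyed; the stored value could still be many meters away from the true location.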

Digital maps are oftentimes no more accurate than the original paper documents.  On USGS topographic maps, for example, roads that ran parallel to railroads were offset from them so that both would be legible on paper.  In other words, the paper map is a cartographic document rather than a record of the true location of each feature.  When the same data is delivered as a digital file, people often mistake these cartographic data files for geographic data files, in which each feature is meant to represent its actual location on the Earth's surface.  They need to recognize that the road might be offset 20 meters from its true location.

Furthermore, the capability of a GIS to display the Earth at almost any scale often creates the illusion that data is more accurate than it really is.  If a map of a set of pipelines, for example, was originally created at 1:2,000,000 scale, one can zoom in to the data at 1:10,000 scale or in even more detail.  However, the data is no more accurate at that scale than it was at 1:2,000,000.  Why should we care?  In this example, we certainly would not want to lay additional pipeline according to a GIS dataset created at 1:2,000,000 scale.  Indeed, a whole GIS legal profession has sprung up because of the inappropriate use of spatial digital data.  If a company sent its crews out to dig in a certain area without truly understanding where a pipeline was, and the crews dug in the wrong spot, the result could be injurious or fatal to the members of that construction crew.
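A little scale arithmetic shows why zooming in does not help.  The sketch below assumes the pipeline was drawn as a line 0.5 millimeters wide on the source map; that width is an illustrative assumption, not a figure from any particular dataset.

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Convert a distance on the map (mm) to a distance on the ground (m)."""
    return map_distance_mm * scale_denominator / 1000.0


LINE_WIDTH_MM = 0.5  # assumed width of the drawn pipeline line

# Ground width covered by the line at the scale the data was created.
at_source_scale = ground_distance_m(LINE_WIDTH_MM, 2_000_000)

# Ground width the same line appears to cover after zooming in.
at_zoomed_scale = ground_distance_m(LINE_WIDTH_MM, 10_000)

print(f"At 1:2,000,000 the line covers {at_source_scale:.0f} m of ground")
print(f"At 1:10,000 it appears to cover only {at_zoomed_scale:.0f} m")
```

Under that assumption, the line itself hides a kilometer of ground at the source scale.  Zooming to 1:10,000 makes the line look as if it pins the pipeline down to a few meters, but the uncertainty inherited from the 1:2,000,000 source is unchanged.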

In the road example above, 20 meters might not be important to some users but critical for others, such as a state highway department or a utility company that is planning to lay pipeline alongside the road.  This leads to the concept of Fitness for Use.  What data is considered "of good quality"?  That cannot be answered in one sentence.  It is the data user's responsibility to determine whether the data is fit to use for their own application.
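Fitness for Use can be thought of as a comparison between the error in the data and the error a particular application can tolerate.  The sketch below uses the 20-meter road offset from the notes; the two applications and their tolerances are hypothetical values chosen only to illustrate the idea, and a real decision would weigh much more than positional error.

```python
def fit_for_use(positional_error_m, tolerance_m):
    """True when the data's positional error is within what the use can tolerate."""
    return positional_error_m <= tolerance_m


ROAD_OFFSET_M = 20.0  # possible offset of the road, from the example in the notes

# Hypothetical applications and the positional error each could accept.
tolerances_m = {
    "regional overview map": 100.0,
    "pipeline trench layout": 1.0,
}

verdicts = {
    use: fit_for_use(ROAD_OFFSET_M, tolerance)
    for use, tolerance in tolerances_m.items()
}

for use, ok in verdicts.items():
    print(f"{use}: {'fit' if ok else 'NOT fit'} for use")
```

The same dataset passes for one user and fails for another, which is why the question of quality cannot be answered once for everyone.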

It is the data producer's responsibility to provide truth in labeling.  Producers should provide information about who created the data, when it was created, for what purpose, the scale it was created at, the source documents, the content, the field names in the tables, the map projection, and so on.  This is done with metadata, or data describing the spatial data.  Metadata files should be included with any spatial data set you use.  However, the reality is that the metadata are oftentimes missing.  All federally created spatial data is required to use the metadata standards from the Federal Geographic Data Committee (FGDC), at http://www.fgdc.gov.  Again, the user has to decide whether the data is fit to use, which is a particular challenge if the metadata are absent.
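A simple habit is to check a data set's metadata against a list of the items named above before using the data.  The sketch below is one possible way to do that.  The field names are hypothetical labels taken from the list in the paragraph above, not actual FGDC element names, and the sample record is invented.

```python
# Hypothetical field names based on the items listed in the notes.
REQUIRED_FIELDS = [
    "creator",
    "creation_date",
    "purpose",
    "source_scale",
    "source_documents",
    "field_names",
    "map_projection",
]


def missing_metadata(metadata):
    """Return the required fields that are absent or left blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Invented example record with several gaps.
sample = {
    "creator": "Example Agency",
    "creation_date": "",
    "source_scale": "1:2,000,000",
    "map_projection": None,
}

gaps = missing_metadata(sample)
print("Missing metadata:", ", ".join(gaps) if gaps else "none")
```

A long list of gaps does not prove the data is bad, but it does mean the user has less evidence for deciding whether the data is fit to use.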

None of these accuracy issues are new.  They have existed ever since maps have been drawn.  And accuracy issues in digital data are not confined to maps.  They affect ALL data types.  None of these issues take away from the incredible power of GIS analysis, either.  It is just something to continually keep in mind.  The bottom line:  Be critical of the data!  Never assume data quality.  Ask questions about it.  Examine the metadata file.  At each step in your GIS analysis, examine the results and make sure you are getting back answers to what you think you asked.   We'll discuss data quality in more detail later in the course.


Bibliographies

This week, your first annotated bibliography will be assigned.  See the assignment for more details.  You will discover through searching the GIS-related literature that the applications of GIS really have no limit.  GIS can be applied to everything from cultural resource management, to water resource management, to urban and regional planning, to retail site location, to natural hazard mitigation, to emergency management, to health and medicine, to habitat studies, to climate and weather, to education, and a great deal more.

Author:  Joseph J. Kerski, Geographer, USGS, jjkerski@usgs.gov, 303-202-4315 

U.S. Department of the Interior | U.S. Geological Survey
URL: http://education.usgs.gov/common/lessons/dataqualitynotes.html
Page Contact Information: USGS Education Web Team
Page Last Modified: Wednesday, 29-Aug-2007 15:12:03 EDT