Content standards for digital geospatial metadata -- FGDC
Organization of the Standard
Numbered Sections
The standard is organized in a hierarchy of data elements and compound
elements that define the information content for metadata to document a
set of digital geospatial data. The starting point is "metadata" (section
0). The compound element "metadata" is composed of other compound
elements representing different concepts about the data set. Each of
these compound elements has a numbered section in the standard. In each
numbered section, these compound elements are defined by other compound
elements and data elements. The section "contact information" is a
special section that specifies the data elements for contacting
individuals and organizations. This section is used by other sections,
and is defined once for convenience.
Each section begins with the name and definition of the compound element
that defines the section. The name and definition are followed by
production rules (see below) that define this compound element in terms of
data elements, either directly or by the use of intermediate compound
elements. When intermediate compound elements are used, the production
rules for these elements also are provided in this part of the section.
The production rules are followed by a list of names and definitions of
compound elements and data elements used in the section.
Compound Elements
A compound element is a group of data elements and other compound
elements. All compound elements are described by data elements, either
directly or through intermediate compound elements. Compound elements
represent higher-level concepts that cannot be represented by individual
data elements. The form for the definition of compound elements is:
- Compound element name -- 0.0.0
- definition. Compound.
The type of "compound" uniquely identifies the compound elements in the
lists of terms and definitions.
Production Rules
A production rule specifies the relationship between a compound element
and the data elements and other (lower-level) compound elements. Each
production rule has a left side (identifier) and a right side (expression)
connected by the symbol "=", meaning that the term on the left side is
replaced by or produces the term on the right side. Terms on the right
side are either other compound elements or individual data elements. By
making substitutions using matching terms in the production rules, one can
explain higher-level concepts using data elements.
The symbols used in the production rules have the following meaning:
- =
- is replaced by, produces, consists of
- +
- and
- [|]
- selection - select one term from the list of enclosed terms (exclusive or). Terms are separated by "|".
- m{}n
- iteration - the term(s) enclosed is(are) repeated from "m" to "n" times
- ()
- optional - the term(s) enclosed is(are) optional
Examples:
- a = b + c
- "a consists of b and c"
- a = [b | c]
- "a consists of one of b or c"
- a = 4{b}6
- "a consists of four to six occurrences of b"
- a = b + (c)
- "a consists of b and optionally c"
Interpreting the production rules:
- The terms bounded by parentheses, "(" and ")", are optional and are
provided at the discretion of the data producer. If a producer chooses to
provide information enclosed by parentheses, the producer shall follow the
production rules for the enclosed information. For example, if the
producer decides to provide the optional information described in the term:
(a + b + c)
the producer shall provide a and b and c.
Only for terms bounded by parentheses does the producer have the
discretion of deciding whether or not to provide the information.
- Geospatial data are produced and distributed in varied ways, not all
geospatial data have the same characteristics, and not all details of
data sets that are in work or are planned may have been decided. These
conditions created the need to express the
concept of "mandatory if applicable." This concept means that if the data
set exhibits (or, for data sets that are in work or planned, it is known
that the data set will exhibit) a defined characteristic, then the
producer shall provide the information needed to describe that
characteristic. This concept is described by the production rule:
0{ term }1
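The production-rule symbols can be read as a small pattern language. The following is a minimal sketch, not part of the standard, that maps each symbol onto a regular expression and checks whether a sequence of element names satisfies a rule. The helper names (term, seq, one_of, repeat, optional, conforms) are hypothetical. The sketch assumes that elements appear in the order the rule lists them and that element names contain no ";" character.

```python
import re

# Hypothetical helpers: each returns a regular-expression fragment.
# Every element name in the encoded sequence is terminated by ";".

def term(name):
    # A single data element or compound element name.
    return "(?:" + re.escape(name) + ";)"

def seq(*parts):
    # b + c : all parts, in the order given (ordering is an assumption).
    return "(?:" + "".join(parts) + ")"

def one_of(*choices):
    # [b | c] : exactly one of the enclosed terms.
    return "(?:" + "|".join(choices) + ")"

def repeat(low, high, body):
    # m{b}n : the enclosed term repeated from m to n times.
    return f"(?:{body}){{{low},{high}}}"

def optional(body):
    # (b) : the enclosed term(s) may be omitted as a whole.
    return "(?:" + body + ")?"

def conforms(rule, names):
    # Encode the supplied element names and require a full match.
    encoded = "".join(name + ";" for name in names)
    return re.fullmatch(rule, encoded) is not None

# (a + b + c): provide all three terms or none of them.
group = optional(seq(term("a"), term("b"), term("c")))
print(conforms(group, []))               # nothing provided
print(conforms(group, ["a", "b", "c"]))  # all provided
print(conforms(group, ["a", "b"]))       # partial, not allowed

# 0{ term }1: "mandatory if applicable" permits zero or one occurrence.
applicable = repeat(0, 1, term("term"))
print(conforms(applicable, ["term"]))
```

Because the optional group is matched as a unit, providing only part of a parenthesized term fails the check, which mirrors the requirement that a producer who supplies optional information shall follow the enclosed production rules.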
Data Elements
A data element is a logically primitive item of data. The entry for a
data element includes the name of the data element, the definition of the
data element, and a description of the values that can be assigned to the
data element. The form for the definition of the data elements is:
- Data element name -- 0.0.0
- definition.
- Type:
- Domain:
If a data element is described earlier on the same HTML page, the section
number is put in parentheses. If the data type is described on another
HTML page, a link is given to the page.
The information about the values for the data elements includes a
description of the type of the value, and a description of the domain of
the valid values. The type of the data element describes the kind of
value to be provided. The choices are "integer" for integer numbers,
"real" for real numbers, "text" for ASCII characters, "date" for day of
the year, and "time" for time of the day.
The domain describes valid values that can be assigned to the data
element. The domain may specify a list of valid values, references to
lists of valid values, or restrictions on the range of values that can be
assigned to a data element.
The domain also may indicate that it is free from restrictions, so that
any value that can be represented by the "type" of the data element can
be assigned. These unrestricted domains are represented by the use of the
word "free" followed by the type of the data element (that is, free text,
free date, free real, free time, free integer).
Some domains can be partly, but not completely, specified. For example,
there are several widely used data transfer formats, but there may be many
more that are less well known. To allow a producer to describe its data
in these circumstances, the convention of providing a list of values
followed by the designation of a "free" domain was used. In these cases,
assignments of values shall be made from the provided domain when
possible. When not possible, providers may create and assign their own
value. A created value shall not redefine a value provided by the
standard.
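As an illustration only, a partly specified domain (a list of values followed by "free text") might be checked as in the sketch below. The function name classify and the listed values "FORMAT-A" and "FORMAT-B" are hypothetical placeholders, not values taken from the standard.

```python
# Hypothetical listed values; the actual lists are given in the numbered
# sections of the standard and are not reproduced here.
LISTED_FORMATS = {"FORMAT-A", "FORMAT-B"}

def classify(value, listed, free=False):
    """Classify a text value against a domain.

    listed -- the values enumerated by the domain
    free   -- True when the list is followed by a "free text" designation
    """
    if value in listed:
        # Values shall be taken from the provided domain when possible.
        return "listed"
    if free and value.strip():
        # Otherwise the producer may create a value; since it is not in
        # the list, it cannot redefine a value provided by the standard.
        return "producer-defined"
    return "invalid"

print(classify("FORMAT-A", LISTED_FORMATS, free=True))
print(classify("MY-FORMAT", LISTED_FORMATS, free=True))
print(classify("MY-FORMAT", LISTED_FORMATS, free=False))
```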
Another issue is the representation of null values (representing such
concepts as "unknown") in the domain. While this is relatively simple for
textual entries (one would enter the text "Unknown"), it is not as simple
for the integer, real, date, and time types (for example, which integer
value means "unknown"?). Because conventions for providing this
information vary among implementations, the standard specifies what
concepts shall be represented, but does not mandate a means for
representing them.
In addition to the values to be represented, the form of the
representation also is important, especially to applications that will
manipulate the data elements. The following conventions for forms of
values for data elements shall be used:
- Calendar Dates (Years, Months, and Days)
- A.D. Era to December 31, 9999 A.D. -- Values for day and month of
year, and for years, shall follow the calendar date convention (general
forms of YYYY for years; YYYYMM for month of a year (with month being
expressed as an integer), and YYYYMMDD for a day of the year) specified in
American National Standards Institute, 1986, Representation for calendar
date and ordinal date for information interchange (ANSI X3.30-1985): New
York, American National Standards Institute (adopted as Federal
Information Processing Standard 4-1).
- B.C. Era to 9999 B.C. -- Values for day and month of year, and for
years, shall follow the calendar date convention, preceded by the lower
case letters "bc" (general forms of bcYYYY for years; bcYYYYMM for month
of a year (with month being expressed as an integer), and bcYYYYMMDD for a
day of the year).
- B.C. Era before 9999 B.C. -- Values for the year shall consist of as
many numeric characters as are needed to represent the number of the year
B.C., preceded by the lower case letters "cc" (general form of
ccYYYYYYY...).
- A.D. Era after 9999 A.D. -- Values for the year shall consist of as
many numeric characters as are needed to represent the number of the year
A.D., preceded by the lower case letters "cd" (general form of
cdYYYYYYY...).
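The calendar date forms above can be sketched as a small formatter. This is an illustrative reading, not part of the standard. The function name format_date is hypothetical. The sketch assumes that years up to 9999 are zero-padded to four digits, and that only the year is recorded for the "cc" and "cd" forms because the text defines only a year form for them. It checks simple ranges only and does not validate the day against the length of the month.

```python
def format_date(year, month=None, day=None, bc=False):
    """Format a calendar date using the conventions described above."""
    if year < 1:
        raise ValueError("year must be a positive number")
    if day is not None and month is None:
        raise ValueError("a day requires a month")

    if year > 9999:
        # Beyond four digits only the year is recorded (assumption):
        # "cc" for B.C. years, "cd" for A.D. years.
        if month is not None:
            raise ValueError("only the year is recorded beyond 9999")
        return ("cc" if bc else "cd") + str(year)

    # Years 1 through 9999: YYYY, YYYYMM, or YYYYMMDD, with "bc" for B.C.
    text = ("bc" if bc else "") + f"{year:04d}"
    if month is not None:
        if not 1 <= month <= 12:
            raise ValueError("month out of range")
        text += f"{month:02d}"
    if day is not None:
        if not 1 <= day <= 31:
            raise ValueError("day out of range")
        text += f"{day:02d}"
    return text

print(format_date(1994, 6, 8))       # 19940608
print(format_date(500, bc=True))     # bc0500
print(format_date(12000, bc=True))   # cc12000
```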
- Time of Day (Hours, Minutes, and Seconds)
- Because some geospatial data and related applications are sensitive to
time of day information, three conventions are permitted. Only one
convention shall be used for metadata for a data set. The conventions are:
- Local Time. For producers who wish to record time in local time,
values shall follow the 24-hour timekeeping system for local time of day
in the hours, minutes, seconds, and decimal fractions of a second (to the
precision desired) without separators convention (general form of
HHMMSSSS) specified in American National Standards Institute, 1986,
Representations of local time of day for information interchange (ANSI
X3.43-1986): New York, American National Standards Institute (adopted as
Federal Information Processing Standard 58-1).
- Local Time with Time Differential Factor. For producers who wish to
record time in local time and the relationship to Universal Time
(Greenwich Mean Time), values shall follow the 24-hour timekeeping system
for local time of day in hours, minutes, seconds, and decimal fractions of
a second (to the resolution desired) without separators convention. This
value shall be followed, without separators, by the time differential
factor. The time differential factor expresses the difference in hours
and minutes between local time and Universal Time. It is represented by a
four-digit number preceded by a plus sign (+) or minus sign (-),
indicating the hours and minutes the local time is ahead of or behind
Universal Time, respectively. The general form is HHMMSSSSshhmm, where
HHMMSSSS is the local time using 24-hour timekeeping (expressed to the
precision desired), 's' is the plus or minus sign for the time
differential factor, and hhmm is the time differential factor. (This
option allows producers to record local time and time zone information.
For example, Eastern Standard Time has a time differential factor of
-0500, Central Standard Time has a time differential factor of -0600,
Eastern Daylight Time has a time differential factor of -0400, and Central
Daylight Time has a time differential factor of -0500.) This option is
specified in American National Standards Institute, 1975, Representations
of universal time, local time differentials, and United States time zone
reference for information interchange (ANSI X3.51-1975): New York,
American National Standards Institute (adopted as Federal Information
Processing Standard 59).
- Universal Time (Greenwich Mean Time). For producers who wish to
record time in Universal Time (Greenwich Mean Time), values shall follow
the 24-hour timekeeping system for Universal Time of day in hours,
minutes, seconds, and decimal fractions of a second (expressed to the
precision desired) without separators convention, with the upper case
letter "Z" directly following the low-order (or extreme right hand) time
element of the 24-hour clock time expression. The general form is
HHMMSSSSZ, where HHMMSSSS is Universal Time using 24-hour timekeeping, and
Z is the letter "Z". This option is specified in American National
Standards Institute, 1975, Representations of universal time, local time
differentials, and United States time zone reference for information
interchange (ANSI X3.51-1975): New York, American National Standards
Institute (adopted as Federal Information Processing Standard 59).
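The three time-of-day conventions differ only in their suffix, which the following sketch tries to make concrete. It is an illustration under stated assumptions, not a definitive implementation. The function names are hypothetical. The sketch reads "without separators" to mean that any fraction digits follow the seconds directly with no decimal point, and it passes the fraction as a string of digits so the desired precision is preserved.

```python
def local_time(hour, minute, second, fraction=""):
    """Local time: HHMMSS followed by optional fraction-of-second digits."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time of day out of range")
    if fraction and not fraction.isdigit():
        raise ValueError("fraction must contain digits only")
    return f"{hour:02d}{minute:02d}{second:02d}{fraction}"

def local_time_with_differential(hour, minute, second, offset_minutes,
                                 fraction=""):
    """Local time followed by the time differential factor (shhmm).

    offset_minutes is the signed number of minutes that local time is
    ahead of (positive) or behind (negative) Universal Time.
    """
    sign = "+" if offset_minutes >= 0 else "-"
    hh, mm = divmod(abs(offset_minutes), 60)
    base = local_time(hour, minute, second, fraction)
    return f"{base}{sign}{hh:02d}{mm:02d}"

def universal_time(hour, minute, second, fraction=""):
    """Universal Time: the 24-hour time followed by the letter Z."""
    return local_time(hour, minute, second, fraction) + "Z"

# 14:30:15 Eastern Standard Time (differential -0500) in each convention.
print(local_time(14, 30, 15))                          # 143015
print(local_time_with_differential(14, 30, 15, -300))  # 143015-0500
print(universal_time(19, 30, 15))                      # 193015Z
```

Only one of the three conventions shall be used within the metadata for a data set, so an application would normally select one of these functions once and apply it throughout.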
- Latitude and Longitude
- Values for latitude and longitude shall be expressed as decimal
fractions of degrees. Whole degrees of latitude shall be represented by a
two-digit decimal number ranging from 0 through 90. Whole degrees of
longitude shall be represented by a three-digit decimal number ranging
from 0 through 180. When a decimal fraction of a degree is specified, it
shall be separated from the whole number of degrees by a decimal point.
Decimal fractions of a degree may be expressed to the precision desired.
- Latitudes north of the equator shall be specified by a plus sign (+),
or by the absence of a minus sign (-), preceding the two digits
designating degrees. Latitudes south of the Equator shall be designated
by a minus sign (-) preceding the two digits designating degrees. A point
on the Equator shall be assigned to the Northern Hemisphere.
- Longitudes east of the prime meridian shall be specified by a plus
sign (+), or by the absence of a minus sign (-), preceding the three
digits designating degrees of longitude. Longitudes west of the prime
meridian shall be designated by a minus sign (-) preceding the three digits
designating degrees. A point on the prime meridian shall be assigned to
the Eastern Hemisphere. A point on the 180th meridian shall be assigned
to the Western Hemisphere. One exception to this last convention is
permitted. For the special condition of describing a band of latitude
around the earth, the East Bounding Coordinate data element shall be
assigned the value +180 (180) degrees.
- Any spatial address with a latitude of +90 (90) or -90 degrees will
specify the position at the North or South Pole, respectively. The
component for longitude may have any legal value.
With the exception of the special condition described above, this form is
specified in Department of Commerce, 1986, Representation of geographic
point locations for information interchange (Federal Information
Processing Standard 70-1): Washington, Department of Commerce, National
Institute of Standards and Technology.
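The latitude and longitude rules, including the hemisphere assignments, can be sketched as follows. This is an illustrative reading rather than a definitive implementation. The function names, the default of four decimal places, and the parameter east_bound_of_latitude_band are assumptions. The sketch always writes an explicit plus sign, which the text permits but does not require.

```python
def _degrees(value, whole_digits, places):
    # Zero-pad the whole degrees and always emit an explicit sign.
    sign = "-" if value < 0 else "+"
    width = whole_digits + (places + 1 if places > 0 else 0)
    return f"{sign}{abs(value):0{width}.{places}f}"

def format_latitude(value, places=4):
    """Two-digit whole degrees, -90 through +90, as decimal degrees."""
    if not -90 <= value <= 90:
        raise ValueError("latitude out of range")
    # Rounding first keeps values that round to zero in the Northern
    # Hemisphere, where a point on the Equator is assigned.
    return _degrees(round(value, places), 2, places)

def format_longitude(value, places=4, east_bound_of_latitude_band=False):
    """Three-digit whole degrees, -180 through +180, as decimal degrees."""
    if not -180 <= value <= 180:
        raise ValueError("longitude out of range")
    value = round(value, places)
    # A point on the 180th meridian belongs to the Western Hemisphere,
    # except for the East Bounding Coordinate of a band of latitude.
    if value == 180 and not east_bound_of_latitude_band:
        value = -180.0
    return _degrees(value, 3, places)

print(format_latitude(38.8977))    # +38.8977
print(format_longitude(-77.0365))  # -077.0365
print(format_longitude(180))       # -180.0000
print(format_longitude(180, east_bound_of_latitude_band=True))  # +180.0000
```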
- Network Addresses and File Names
- Values for file names, network addresses for computer systems, and
related services should follow the Uniform Resource Locator convention of
the Internet when possible. For additional details about the Uniform
Resource Locator, see
http://www.ncsa.uiuc.edu/demoweb/url-primer.html
June 8, 1994