USGS - science for a changing world

USGS Geoscience Data Catalog

Improving access to metadata using keywords from controlled vocabularies

This information was also published as part of USGS Open-File Report 01-223

On this site I am exploring the use of controlled vocabularies in the Keywords section of the FGDC metadata. By controlled vocabularies I refer to formally-defined lists of terms, usually hierarchical, that are preferred for use in specific ways. By controlled I mean that you can't just use any terms you like. The list is maintained by an authority (a person or group) who ensures that the terms are all defined consistently and have well-defined relationships. I am using two such vocabularies, one for place names and the other for theme keywords.

Place names

For place names I have chosen to use two Federal Information Processing Standards (FIPS), 6-4 and 10-4. FIPS 6-4 specifies numerical codes for states and counties (or equivalent entities) in the US and its territories. Each state is identified using a two-digit number, and each county within the state is identified using a three-digit number. Thus a county can be unambiguously identified using a five-digit code consisting of its state code followed by its county code. Unique codes are needed for these place names because many states have counties with the same name; Jefferson, Washington, Lincoln, and Grand, for example, are common county names.
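
The way the five-digit code is built can be sketched in a few lines of Python. This is only an illustration, not part of the Place Keyword Assistant: the function names are hypothetical, and the two Jefferson County codes are sample values that should be checked against the FIPS 6-4 tables before being relied on.

```python
def county_fips(state_code, county_code):
    """Join a two-digit state code and a three-digit county code."""
    if not (len(state_code) == 2 and state_code.isascii() and state_code.isdigit()):
        raise ValueError(f"state code must be two digits: {state_code!r}")
    if not (len(county_code) == 3 and county_code.isascii() and county_code.isdigit()):
        raise ValueError(f"county code must be three digits: {county_code!r}")
    return state_code + county_code


def split_county_fips(code):
    """Split a five-digit county code back into (state, county) parts."""
    if not (len(code) == 5 and code.isascii() and code.isdigit()):
        raise ValueError(f"county code must be five digits: {code!r}")
    return code[:2], code[2:]


# Two counties that share the name Jefferson but lie in different states
# (sample values; verify against the standard).
jefferson_al = county_fips("01", "073")  # Alabama
jefferson_co = county_fips("08", "059")  # Colorado
print(jefferson_al, jefferson_co)
```

The codes are kept as strings rather than integers so that leading zeros, as in the state code 04, are not lost.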

FIPS 10-4 specifies alphanumeric codes for countries of the world and first-order subdivisions of them. Of the first-order subdivisions I have used only states in the United States and in Mexico and provinces in Canada. This decision reflects the distribution of data that I wish to categorize by place.

I have augmented these standard place names with names of major oceanic regions and names of continental regions. These groupings allow me to build a pick-list interface with a relatively narrow and deep hierarchy, so that users don't have too many choices at the highest level, where they begin to choose places.

Place Keyword Assistant: A tool to select place names for metadata

Place names by themselves don't help much; the key is to associate each record with the corresponding place names from the controlled vocabulary. You can do this manually, of course, using your favorite text editor or Tkme. Just add lines like this:

    Keywords:
      Place:
        Place_Keyword_Thesaurus: Augmented FIPS 10-4 and FIPS 6-4, version 1.0
        Place_Keyword: US56 = Wyoming
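
If you would rather generate these lines than type them, something like the following Python sketch could do it. It only formats text in the indented layout shown above; it does not parse or validate a metadata record, and the function name and its `indent` parameter are hypothetical.

```python
THESAURUS = "Augmented FIPS 10-4 and FIPS 6-4, version 1.0"


def place_keyword_lines(places, indent=""):
    """Format (code, name) pairs as an indented Keywords/Place block.

    `indent` is whatever leading whitespace the enclosing section of the
    record requires; the nesting below it uses two spaces per level.
    """
    lines = [
        f"{indent}Keywords:",
        f"{indent}  Place:",
        f"{indent}    Place_Keyword_Thesaurus: {THESAURUS}",
    ]
    for code, name in places:
        lines.append(f"{indent}    Place_Keyword: {code} = {name}")
    return lines


print("\n".join(place_keyword_lines([("US56", "Wyoming")])))
```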

But when you're dealing with a large number of records, it helps to use a specialized tool for this purpose. The tool I've developed is called the Place Keyword Assistant. It is written in Tcl/Tk, so to use it you'll need to install Tcl/Tk on your system and also install the mq extension, which enables Tcl/Tk scripts to read, modify, and write FGDC metadata.

The Place Keyword Assistant has the following major functions:

  1. Read metadata records. Metadata records may be
    1. named on the command line,
    2. listed in a file that is named on the command line, or
    3. found recursively in the current directory and its subdirectories.
  2. Display each record as it is selected. The text is shown in a simple scrollable window.
  3. Present hierarchical place keywords for the user to choose, and keep track of keywords that have been chosen.
  4. Save the selected place keywords in the metadata record.
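
The recursive search in the first function can be sketched in Python as follows. This is not the code used by the Place Keyword Assistant, which is a Tcl/Tk script; the function name is hypothetical, and the `.met` extension is taken from the installation notes on this page.

```python
import os


def find_metadata_files(root=".", extension=".met"):
    """Return paths of files under `root` whose names end with `extension`."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            if name.lower().endswith(extension):
                found.append(os.path.join(dirpath, name))
    return found


# Example use: list every .met file below the current directory.
#     for path in find_metadata_files("."):
#         print(path)
```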

The Place Keyword Assistant creates three windows as shown in the reduced-size image here. One contains a list of metadata records (by file name) that you can edit. It creates this list by drilling downward through all of the directories below the one where the program is stored. Choose a metadata record from this list. Entries shown in green have some place keywords assigned using this software; those shown in red might have place keywords, but not keywords chosen from this vocabulary. The second window simply shows the text of each record as it is selected. The third window shows you the place names that you can assign to the metadata record.
[Figure: reduced-size image showing the three windows of the Place Keyword Assistant: the list of metadata records by directory and file name (green means keywords have been assigned; red means keywords need to be assigned), the keyword chooser window (see full-sized example below), and the text of the metadata record.]
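
One way the green and red marking could be decided is sketched below in Python. This page does not say how the Place Keyword Assistant actually makes the decision, so the rule used here is an assumption: a record counts as green if it names the augmented FIPS thesaurus in a Place_Keyword_Thesaurus line. The function and constant names are hypothetical.

```python
THESAURUS_PREFIX = "Augmented FIPS 10-4 and FIPS 6-4"


def record_status(record_text):
    """Return "green" if the record cites the augmented FIPS thesaurus, else "red"."""
    for line in record_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if (
            sep
            and key == "Place_Keyword_Thesaurus"
            and value.strip().startswith(THESAURUS_PREFIX)
        ):
            return "green"
    return "red"


sample = """Keywords:
  Place:
    Place_Keyword_Thesaurus: Augmented FIPS 10-4 and FIPS 6-4, version 1.0
    Place_Keyword: US56 = Wyoming
"""
print(record_status(sample))
```

Under this rule a record whose place keywords come from some other thesaurus is still shown in red, which matches the description above.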

The keyword chooser window is shown below at full size. It consists of five list windows each of whose contents are determined by the window to its left. In this example the user chose Land from among Oceans and Land, then North America from the list of continents, then United States from the list of countries in North America, then the state of Arizona, and from its counties the one named Graham. The list in the lower right corner contains those places whose names have been selected for inclusion in the metadata record. Its background is blue to distinguish it from the others visually, and its entries include the unique FIPS code associated with each area.
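
The dependence of each list on the selection to its left can be modeled with a nested dictionary, as in this Python sketch. The data are a tiny illustrative subset, not the full vocabulary, and the names `HIERARCHY`, `choices`, and `columns` are hypothetical; the real tool reads its place names from placekey.txt, whose format is not described here.

```python
# A small, illustrative slice of the place hierarchy.
HIERARCHY = {
    "Oceans": {},
    "Land": {
        "North America": {
            "Canada": {},
            "Mexico": {},
            "United States": {
                "Arizona": {"Cochise": {}, "Graham": {}},
                "New Mexico": {},
            },
        },
    },
}


def choices(path):
    """Return the entries offered after following `path` from the top level."""
    node = HIERARCHY
    for name in path:
        node = node[name]
    return sorted(node)


def columns(path, n_lists=5):
    """Return the contents of the lists, left to right, for a selection path.

    A list is empty until something has been selected in the list to its left.
    """
    return [choices(path[:i]) if i <= len(path) else [] for i in range(n_lists)]


selection = ["Land", "North America", "United States", "Arizona", "Graham"]
for contents in columns(selection):
    print(contents)
```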

[Figure: full-sized view of the keyword selection window.]
Buttons, keystroke equivalents, and what they do

  Button  Keystroke  Description
  ------  ---------  -----------
  Add     Enter      Include the most specific place name currently selected.
  Remove  Backspace  Remove the selected place name from the list.
  Clear   (none)     Remove all place names from the list.
  Save    Ctrl-S     Write the listed place names into the metadata record.
  Prev    Ctrl-P     Write the listed place names into the metadata record and load the previous metadata record from the file list.
  Skip    PgDn       Leave the current metadata record unchanged and load the next record from the list.
  Next    Ctrl-N     Write the listed place names into the metadata record and load the next metadata record from the file list.

  Choices on the Edit menu

  Copy    Ctrl-C     Copy the current list of place keywords to the clipboard.
  Paste   Ctrl-V     Paste the clipboard's contents into the current record's list of place keywords.
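
The difference between Prev, Skip, and Next is easiest to see in a small model. The Python sketch below is an assumption-laden illustration, not the tool's code: the class name is hypothetical, "writing" is simulated with an in-memory dictionary instead of a metadata file, duplicate keywords are ignored, and moving past either end of the file list simply stays on the first or last record.

```python
class KeywordSession:
    """Toy model of the button behavior described in the table above."""

    def __init__(self, records):
        self.records = list(records)  # file names, in list order
        self.index = 0                # position of the current record
        self.chosen = []              # place keywords listed for the current record
        self.saved = {}               # record name -> keywords written to it

    def add(self, keyword):
        if keyword not in self.chosen:
            self.chosen.append(keyword)

    def remove(self, keyword):
        if keyword in self.chosen:
            self.chosen.remove(keyword)

    def clear(self):
        self.chosen = []

    def save(self):
        self.saved[self.records[self.index]] = list(self.chosen)

    def _load(self, index):
        # Clamp to the ends of the list, then show what was saved, if anything.
        self.index = max(0, min(index, len(self.records) - 1))
        self.chosen = list(self.saved.get(self.records[self.index], []))

    def next(self):
        self.save()
        self._load(self.index + 1)

    def prev(self):
        self.save()
        self._load(self.index - 1)

    def skip(self):
        self._load(self.index + 1)  # no save: the record is left unchanged


session = KeywordSession(["a.met", "b.met"])
session.add("US56 = Wyoming")
session.skip()   # a.met is left unchanged
session.add("US04 = Arizona")
session.prev()   # b.met is written, then a.met is loaded
print(session.saved)
```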

Detailed installation instructions for use on MS-Windows

  1. Install Tcl/Tk
    1. Download tcl832.exe.
    2. Run tcl832.exe. Choose default install in C:\Program Files\Tcl.
    3. Restart. This makes the system recognize file names ending with .tcl as Tcl scripts.
  2. Install MQ
    1. Download the big package of metadata tools for MS-Windows.
    2. Run all_win.exe. Allow the installer to store the files in C:\USGS.
    3. Copy C:\USGS\tools\bin\mq25.dll into C:\Program Files\Tcl\lib.
    4. Create directory C:\Program Files\Tcl\lib\mq.
    5. Copy C:\USGS\tools\bin\pkgIndex.tcl into C:\Program Files\Tcl\lib\mq.
    6. Test by running Wish
      1. Choose Wish from the Start menu, following Programs > Tcl > Wish.
      2. Two windows appear. One is labeled "Console" and contains a prompt (percent sign). Click this window.
      3. At the % prompt, type package require mq then press Enter.
      4. The interpreter should respond with the version number of mq. At this writing this value is 2.5.5. If you get an error message instead, something didn't get installed right.
  3. Install Place Keyword Assistant
    1. Choose a directory at or above the ones where you have stored your metadata. There can be other files in its subdirectories, but this works out-of-the-box only if your metadata files all have the extension .met. For this example, suppose this is D:\data.
    2. Download placekey.txt and save in D:\data.
    3. Download placer.tcl and save in D:\data. This file should have a "Tk" icon.
    4. Double-click placer.tcl. The three windows of the Place Keyword Assistant should appear.

Using ArcExplorer 3 with the Place Keyword Assistant

[Figure: ArcExplorer 3 map showing counties in the southwest US with some data over them; 8 counties of New Mexico are selected.]

ESRI's ArcExplorer 3 can be used to display US counties (here focusing on the Southwest) with scientific data overlying the county boundaries. Because the counties are shown as polygons, these can be selected when their layer is made active. After selecting the counties that overlap the scientific data, the user clicks on the Attributes button in the ArcExplorer toolbar to bring up the table of attributes of the selected counties. This table is divided in two panes by a vertical bar. In the left pane the names of the selected counties are shown. The right pane contains the attributes of the county selected last.

Note that what ArcExplorer shows in the left side of the attribute window is the first item of the layer's DBF file that is not an intrinsic attribute of ArcInfo. The counties layer I have used here was downloaded from the National Atlas of the US. I modified the DBF file by deleting the ArcInfo intrinsic attributes and swapping the column positions of the state name and county name attributes, so that the county name comes first.

URL: http://geo-nsdi.er.usgs.gov/keywords.shtml
Page Contact Information: Peter Schweitzer
Page Last Modified: Friday, 19-Oct-2007 13:26:29 EDT