Coastal Services Center

National Oceanic and Atmospheric Administration


MetaScribe: A New Tool for Metadata Generation


Introduction

The NOAA Coastal Services Center's MetaScribe tool is designed to significantly reduce the labor required to produce metadata compliant with the Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM). The tool takes advantage of the fact that, within a collection of records for a given data type, the records are generally very similar in content, with only a few fields or phrases changing from one record to the next. MetaScribe therefore delivers the greatest savings when used to create a collection of similar records. It is not appropriate for creating multiple metadata records that share little content.

MetaScribe is template driven (see example template). The user must create a metadata template, which is uploaded into MetaScribe. Building a template is not a trivial task; however, once a template is created for a given data type, the user can create multiple records quickly and easily.

MetaScribe uses two proofing tools to check the records it creates: cns (Chew and Spit) and mp (Metadata Parser). These programs, created by the U.S. Geological Survey (USGS), will report any errors within a record and require that those errors be fixed before a final metadata record is produced. This illustrates the importance of creating a proper template that will produce error-free records.


Template Construction

A working understanding of the FGDC CSDGM is needed to create a template (or multiple templates for multiple data types). These templates are kept and maintained locally on your system. A new template for a particular data type is warranted when building it takes less labor than continuing to adapt an existing template.

Templates must be plain text. A template is created by writing a metadata record that contains special tags. Each tag specifies an input field. The values entered into the resulting input fields replace the associated tags in the template to produce a completed metadata record. The tags resemble HTML tags, with a tag name and several attributes. (View both an example template and the data entry page it produces.)

The suggested process for creating a template is as follows:

  1. Write your first metadata record. As you write, keep in mind that the record will be used as a template for future records.
  2. Use cns to check the format of your record and repair any problems.
  3. Use mp to verify that it is compliant. Repair any resulting problems.
  4. Identify the portions of text or the fields that will vary from record to record and replace them with tags formatted according to the following:
    <tag_name type=" " label=" " value=" " [fgdc=" "]>

Each attribute (type, label, value, and fgdc) must be followed by an equal sign and a value in double quotes. Do not use double quotes in an attribute's content!

Creating Tags

The tag_name must be unique to the value you want inserted into the record. Multiple tags with the same name will result in a single input field and all instances of that tag in the template will be filled with the same content. For instance, if a particular date will be used repeatedly, include a tag in your template at each point where the date will be used. Name each tag with the same tag_name. The resulting data entry form will have one entry box for the date and the value entered will be used in all occurrences of the tag when the metadata record is produced.
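
The one-field-per-name rule can be illustrated with a short Python sketch. This is not MetaScribe's own code: the regular expressions and the collect_fields function are hypothetical, and they assume tags follow the syntax shown above with no double quotes inside attribute values.

```python
import re

# Assumed tag grammar, based on the syntax shown above:
# <tag_name attr="value" ...>, where values never contain double quotes.
TAG_RE = re.compile(r'<(\w+)((?:\s+\w+="[^"]*")+)\s*>')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def collect_fields(template):
    """Return {tag_name: attributes}, keeping the first occurrence of each name."""
    fields = {}
    for match in TAG_RE.finditer(template):
        name, raw_attrs = match.group(1), match.group(2)
        # Repeated names share a single input field, so later copies are skipped.
        fields.setdefault(name, dict(ATTR_RE.findall(raw_attrs)))
    return fields


template = (
    'Publication_Date: <pub_date type="textbox" label="Date:" value="">\n'
    'Metadata_Date: <pub_date type="textbox" label="Date:" value="">\n'
    'Originator: <originator type="textbox" label="Originator" value="NOAA">\n'
)
fields = collect_fields(template)
print(list(fields))  # ['pub_date', 'originator']
```

Although pub_date appears twice in the sample template, it yields a single entry, which mirrors the single date entry box described above.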

The type attribute must be one of textbox, textarea, checkbox, select, simple_list, or fgdc_list.

Data Entry Field Types

Tag type     Description
textbox      A single-line, text entry field.
textarea     A multi-line, text entry field.
checkbox     The value will be included in the metadata record if the checkbox is checked.
select       Comma-delimited values will be presented as options in a pull-down menu.
simple_list  A multi-line, text entry field containing a list of items. Entries must be one per line. Each line will be added to a comma-separated list in the resulting record.
fgdc_list    A multi-line, text entry field containing a list of items. Entries must be one per line. Each line is added to the resulting metadata record after the associated FGDC attribute value.

The label will show up as the label for your input field.

The value attribute will contain the default content for the field. For text entry fields, the value may also be instructions to the user about how to fill the field correctly, or it can be left empty if no default value is desired. In a select tag, the value attribute should contain a comma-separated list of the items to appear in the pull-down menu.

The fgdc attribute applies only to fgdc_list tags, where it is required. Its value must be an FGDC field name exactly as it appears in the CSDGM. That field name, such as Place_Keyword, will be added before each item entered in the associated data-entry field.
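
As a rough illustration of how the two list types might turn form input into record text, consider the Python sketch below. The function names are hypothetical, and the exact output punctuation (a comma followed by a space, and "Field_Name: item" on each line) is an assumption; the descriptions above state only that items are comma-separated or preceded by the FGDC field name.

```python
def render_simple_list(entry):
    """Join the non-empty input lines into one comma-separated string."""
    items = [line.strip() for line in entry.splitlines() if line.strip()]
    return ", ".join(items)  # separator format is an assumption


def render_fgdc_list(entry, fgdc_name):
    """Emit one line per non-empty input line, each preceded by the field name."""
    items = [line.strip() for line in entry.splitlines() if line.strip()]
    # The "Name: item" layout is an assumption about the output format.
    return "\n".join(f"{fgdc_name}: {item}" for item in items)


entry = "Charleston\nSouth Carolina\n"
print(render_simple_list(entry))
print(render_fgdc_list(entry, "Place_Keyword"))
```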

Since the tags are enclosed by the "less than" (<) and "greater than" (>) symbols, you MUST NOT use these symbols in your template, except as tag punctuation.

Tag Examples

An example of a textbox tag for the Originator field:

<originator type="textbox" value="NOAA Coastal Services
    Center" label="Originator">

That tag would produce this input field.

[Image: data entry cell]

Example textarea tag:

<use_constraints type="textarea" value="None"
    label="Use constraints:">

Example checkbox tag:

<use_this_string type="checkbox" value="text to insert"
    label="Include this or not?">

Example select tag:

<name_of_list type="select" value="1,2,3,4"
    label="Pick one:">

Example fgdc_list tag:

<placekey type="fgdc_list" value="Enter place keywords -
    separated by newline." label="Place Keywords:"
    fgdc="Place_Keyword">

Test Your Template

Once all variable portions of your record have been replaced with tags, use the tag checker to ensure that you have not left out quotes or forgotten to close a tag with the > symbol. To use the on-line tag checker, upload your template into the checker or cut-and-paste your template code into the form and submit it. The checker will attempt to parse the tags in your template and present you with a table showing the unique tags and their attributes. If the table shows that the tags were parsed correctly, you are ready to fine-tune your formatting.
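
For a quick local check before using the on-line checker, a script along the following lines can catch the same kinds of mistakes. It is only a sketch under stated assumptions: check_tags is a hypothetical helper, not the on-line checker, and the set of allowed types is taken from the field types table above.

```python
import re

# Types taken from the field types table; treat the exact set as an assumption.
ALLOWED_TYPES = {"textbox", "textarea", "checkbox", "select",
                 "simple_list", "fgdc_list"}
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def check_tags(template):
    """Return a list of human-readable problems found in the template's tags."""
    problems = []
    pos = 0
    while True:
        start = template.find("<", pos)
        if start == -1:
            break
        end = template.find(">", start)
        if end == -1:
            problems.append(f"tag opened at offset {start} is never closed with '>'")
            break
        body = template[start + 1:end]
        name = body.split(None, 1)[0] if body.strip() else "(empty)"
        attrs = dict(ATTR_RE.findall(body))
        if body.count('"') % 2:
            problems.append(f"{name}: unbalanced double quotes")
        if attrs.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: missing or unknown type")
        if attrs.get("type") == "fgdc_list" and "fgdc" not in attrs:
            problems.append(f"{name}: fgdc_list needs an fgdc attribute")
        pos = end + 1
    return problems


good = '<originator type="textbox" value="NOAA" label="Originator">'
no_fgdc = '<placekey type="fgdc_list" value="x" label="Place Keywords:">'
unclosed = 'Title: <title type="textbox" value="" label="Title"'
for sample in (good, no_fgdc, unclosed):
    print(check_tags(sample))
```

An empty list means the sketch found nothing to complain about; it does not guarantee that MetaScribe will accept the template.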

The last step in creating a template is to wrap your text and place your tags so that the template produces reasonably well-wrapped output. MetaScribe uses cns to clean up the record format and then applies a simple line-wrapping routine to long lines. On occasion, this can produce lines with only a few words. If this is unacceptable, you can structure your template to avoid line wrapping. Getting the desired formatting sometimes takes one or two iterations.


Build Metadata Easily

After creating a well-formed template, follow the steps below to create metadata.

  1. Either upload or cut-and-paste the template into MetaScribe. Be aware that some Netscape browser versions will handle a maximum of 30,000 bytes (about 500 lines) in a text area. If your template is anywhere near that length, please use the upload option.
  2. You will be presented with appropriate data-entry fields as defined in your template. Fill the fields, as appropriate, for your data and submit the form.
  3. MetaScribe will capture your inputs, replace each tag in the template with the submitted input, and run cns and mp on the resulting record. If cns and mp find no errors, the new record is displayed in your browser. You are also provided with a link to save the record and a link to request an e-mail with your record attached.
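
The substitution in step 3 can be pictured with the Python sketch below. It is a simplified illustration, not MetaScribe's implementation: fill_template and the sample disclaimer tag are hypothetical, missing inputs are assumed to become empty strings, and the cns and mp checks are not run.

```python
import re

# Assumed tag grammar: <tag_name attr="value" ...> with no quotes inside values.
TAG_RE = re.compile(r'<(\w+)((?:\s+\w+="[^"]*")+)\s*>')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def fill_template(template, inputs):
    """Replace every tag with the submitted input for its tag_name."""
    def substitute(match):
        name = match.group(1)
        attrs = dict(ATTR_RE.findall(match.group(2)))
        submitted = inputs.get(name, "")  # assumption: missing input means empty
        if attrs.get("type") == "checkbox":
            # A checked box inserts the tag's value text; unchecked inserts nothing.
            return attrs.get("value", "") if submitted else ""
        return str(submitted)
    return TAG_RE.sub(substitute, template)


template = (
    'Originator: <originator type="textbox" value="" label="Originator">\n'
    'Use_Constraints: <use_constraints type="textarea" value="None" label="Use constraints:">'
    '<disclaimer type="checkbox" value=" Not for navigation." label="Add disclaimer?">\n'
)
record = fill_template(template, {
    "originator": "NOAA Coastal Services Center",
    "use_constraints": "None.",
    "disclaimer": True,
})
print(record)
```

Because every tag with a given name is looked up in the same inputs dictionary, a tag name that appears several times in a template is filled with the same value each time.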

Your template is written to a temporary disk cache on the Web server and kept for about four hours. The four-hour period starts when the template is first uploaded and restarts when you complete a record and begin a new record with the same template. At any time after that, the template may be purged from the system. Therefore, please complete the input fields in your record within four hours. If you fail to complete the record in that time, you will have to upload the template again and start over.

Making Sense of cns Errors

Reliance on the USGS cns and mp tools introduces some behaviors into MetaScribe that should be noted. If the content of an FGDC field contains a line that begins with an FGDC field name but is not the beginning of an FGDC field, the cns tool will fail to parse the record correctly and will report an error. For instance, Description is a field name. If the Abstract section contains a line that starts with Description, the cns tool interprets that second occurrence of the string as an FGDC field tag and determines that it is in the wrong place in the record structure. If this occurs in your metadata, you must adjust the format of your template.

The following example of an Abstract section will cause an error.

Description:
  Abstract:
    This data set serves as a definition and
    description of something worth knowing about.

MetaScribe will show an error like the following:

The 'cns' pre-parser returned the following problems:
  29:  of something worth knowing about.

Note that the error does not mention the word "description." If you get such an error, examine your template for field names at the beginning of the reported line or a line very near it. To repair this in your template, simply move the word "description" up onto the end of the previous line.

Description:
  Abstract:
    This data set serves as a definition and description
    of something worth knowing about.
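
Lines that are likely to trip cns in this way can be spotted ahead of time with a small Python sketch like the one below. It is a heuristic only: suspicious_lines is a hypothetical helper, FIELD_NAMES is a small illustrative subset of CSDGM element names rather than the full list, and the case-insensitive matching is inferred from the example above, not from cns documentation.

```python
import re

# Small illustrative subset of CSDGM element names; a real check needs the full list.
FIELD_NAMES = {"description", "abstract", "purpose", "title", "originator"}


def suspicious_lines(record):
    """Return (line_number, text) for content lines that start with a field name."""
    hits = []
    for number, line in enumerate(record.splitlines(), start=1):
        match = re.match(r"\s*([A-Za-z_]+)(:?)", line)
        # A field name followed by a colon is treated as a genuine field line.
        if match and match.group(1).lower() in FIELD_NAMES and not match.group(2):
            hits.append((number, line.strip()))
    return hits


record = (
    "Description:\n"
    "  Abstract:\n"
    "    This data set serves as a definition and\n"
    "    description of something worth knowing about.\n"
)
print(suspicious_lines(record))
```

Any line the sketch reports is a candidate for rewrapping so that the field name no longer starts the line.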

Planned Development

Items under consideration:

  • Examine and possibly replace < and > as tag punctuation.
  • Ideas? Please e-mail your ideas. Although we make no promises about future development, we are definitely interested in good ideas and/or feedback about existing functionality.
