How to Use HEDS -- Reference

HEDS SCREENS

HEDS Home Page
HEDS Studies
Study Information Directory
Download Complete Data Set
Select Data Set Columns
Selected Data Set Columns
Download Customized Data Set
Download Code Set for Customized Data Set
Download Data Dictionary for Customized Data Set
EIMS Navigation Bar Links

INFORMATION ON DOWNLOADS

Information on Downloads

The Human Exposure Database System (HEDS) enables you to view and download human exposure study information. HEDS uses the EPA's Environmental Information Management System (EIMS) as its metadata repository. Therefore, when you are using the system, you can expect to see pages from both HEDS and EIMS.

To access all of the EIMS functions described below, you must enter through the HEDS portal of EIMS. (The HEDS portal is the pathway into the HEDS portion of the EIMS system.) If you have entered through another EIMS portal, you can access some but not all of these functions on the EIMS pages. If you do not see the HEDS logo at the top of the screen or the HEDS-Home link at the bottom, you are not in the HEDS portal. To get to the HEDS portal, use the following URL: http://www.epa.gov/eims/?p=ord-heds. The HEDS portal in EIMS restricts your searches to HEDS entries within EIMS.

To get back to HEDS from EIMS at any time, click on the EIMS Download link for a HEDS entry and then click on one of the HEDS links or click on the HEDS Home link in the footer.

Every data set, document, and study in HEDS is assigned an Entry ID number, which is used in both HEDS and EIMS.

The term "active," when used with study, data set, or customized selection, refers to the most recent choice of that item type.

HEDS is optimized for use with the Netscape browser version 4.0 or higher. Some functions may operate differently for other browsers. Also, some functions may operate differently depending on user settings in the browser. In most cases, the difference is not significant. In instances where it may be, additional information is provided.

Note that you can use your browser's Back button to leave this page at any time and try any of the HEDS Studies links described herein. You can get back to this page by returning to the HEDS Home Page and then clicking How To Use--Reference on the navigation bar or Help in the page footer.

DISCLAIMER

Product and corporate names may be trademarks or registered trademarks of commercial companies and are used only for explanation and to the owners' benefit, without intent to infringe. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

HEDS HOME PAGE

The HEDS Home Page navigation bar includes the following links:

About HEDS. Provides general information.
How to Use. Provides help for navigating through the system. Two links are available, one for First Time User and one for Reference.
HEDS Studies. Provides access to actual studies. (This is the link you will use most to obtain information.)
What's New. Provides information on new studies and new functionalities in HEDS.
Related Web Sites. Provides links to other web sites with related information.
Contact Us. Provides information about how to give feedback on or ask questions about HEDS.
Accessibility. Provides information for users with accessibility needs.

The main window includes two links:

Additional Software. Provides links to other web sites with related information.
Accessibility Options. Provides information for users with accessibility needs.

HEDS STUDIES

This is the starting page for retrieving any HEDS-related information. This page contains links for HEDS projects and studies. A HEDS study contains data sets and documents pertinent to that study. EIMS contains links to and between the data sets and documents within a study. Some HEDS studies may be related to a project, and links from the project to the related studies will be available.

The HEDS Studies page includes the same HEDS navigation bar as the HEDS Home Page. The two kinds of links on the main part of this page and the information they provide are described as follows.

Project. If you click on a project link, you will be taken to an EIMS Summary page having the EIMS navigation bar.

The EIMS Summary page includes

the abstract for the selected project
other project-identifying information

The EIMS navigation bar includes

links to additional information about the record, such as Contacts, Related Entries, Related Documents, Related Data Sets, and Downloads. Further information about links on the EIMS navigation bar is provided below.

Links at the bottom of the page provide access to other EPA sites. The links include a HEDS-Home link to enable you to get back to HEDS.

Study. If you click on a study link, you will be taken to the HEDS Study Information Directory page for a particular study.

STUDY INFORMATION DIRECTORY

This page provides the following links:

Study Metadata. This link takes you to an EIMS Summary page for the active study.
List of Data Sets. This link takes you to an EIMS Related Entries page displaying a list of data sets related to the active study. To access a data set, click on its Related Entry ID link.
List of Documents. This link takes you to an EIMS Related Entries page displaying a list of documents related to the active study. To access a document, click on its Related Entry ID link.

At the bottom of this page is a shortcut for returning directly to a previously viewed Data Set or Document for the active study. To use this feature, you must have previously accessed a data set or document and noted its Entry ID number. Enter the number in the field provided. Be sure to click the appropriate radio button for either Download Data Set or Download Document, and then click Submit. If you click the incorrect radio button for the Entry ID number you enter, you will get an error message reminding you to click the correct radio button.

If you enter an Entry ID number for a data set or document that does not belong to the active study, you will get a message informing you of that fact. Return to the Study Information Directory page and enter a number that belongs to the active study, or return to the HEDS Studies page and select the study associated with the desired Entry ID number.

Download Data Set takes you to the HEDS Download Complete Data Set page. Download Document takes you to the EIMS Downloads page for that document.

DOWNLOAD COMPLETE DATA SET

The page header includes the Entry ID number, title, and a portion of the abstract and the data use and constraints. To view the full abstract and related information, click the View Full Description link just below the text for Data Use and Constraints.

This page and the pages described below have a navigation bar different from other HEDS pages. This navigation bar includes the following sections: Complete Data Set, Customized Data Set, and General.

Under COMPLETE DATA SET on the navigation bar, the Download link refreshes the page. Under the Browse subheading, the links perform the same actions as the similarly named Browse links on the main part of the page. The links on the Download Complete Data Set page are described in the following paragraphs.

Immediately below the Download and Browse Options heading are links that take you to parts of this reference document. Below those links are the following links:

Download Zipped Package in dBase IV Format. This link opens a dialog box that enables you to save the file in a location of your choice. See Information on Downloads for further information about the download file.
Download Zipped Package in ASCII Format. This link opens a dialog box that enables you to save the file in a location of your choice. See Information on Downloads for further information about the download file.
Browse Complete Data Set. This link opens the Browse Complete Data Set page, which allows browsing of the first 10 rows of data. The page header includes the Entry ID number, title, and a portion of the abstract and the data use and constraints. To view the full abstract and related information, click the View Full Description link just below the text for Data Use and Constraints. Use the scroll bars to view columns and rows not displayed on your screen. See Information on Downloads and Data Set for further information about the download file.
Browse Code Set for Complete Data Set. This link opens the Browse Code Set for Complete Data Set page. The page header includes the Entry ID number, title, and a portion of the abstract and the data use and constraints. To view the full abstract and related information, click the View Full Description link just below the text for Data Use and Constraints. Use the scroll bars to view columns and rows not displayed on your screen. See Code Set for further information about the code set.
Browse Data Dictionary for Complete Data Set. This link opens the Browse Data Dictionary for Complete Data Set page. The page header includes the Entry ID number, title, and a portion of the abstract and the data use and constraints. To view the full abstract and related information, click the View Full Description link just below the text for Data Use and Constraints. Use the scroll bars to view columns and rows not displayed on your screen. To download the data dictionary by itself, click the link Download Data Dictionary for Complete Data Set. A data dictionary downloaded from this link will be in the same format as Download Data Dictionary for Customized Data Set. See Data Dictionary for further information about the data dictionary.

Under GENERAL on the navigation bar, the three links are described as follows.

Data Set Metadata. This link takes you to an EIMS Summary page for the active data set.
Study Directory. This link brings up the Study Information Directory page, described above.
HEDS Studies. This link brings up the HEDS Studies page, described above.

Under CUSTOMIZED DATA SET on the navigation bar, only the Select Columns link is active on this page. To create a customized data set based on the active data set, click the Select Columns link. The Select Data Set Columns page appears.

SELECT DATA SET COLUMNS

The Select Data Set Columns page enables you to select columns of a data set for browsing or downloading. By default, all columns are deselected. To select all columns, click the Select All Columns link. All columns are selected, and the link changes to Deselect All Columns. (Note that the record identifying, or key, columns are always selected.)

To manually select or deselect individual columns, click in the selection box to the left of each column name. A check mark indicates that the column is selected. To clear all manually selected columns, click the Clear button. If you want most of the columns, you can click the Select All Columns link to select all the columns and then individually deselect the columns you do not want. However, if you find that you want most of the columns, it is recommended that you download the complete data set.

When you have selected the desired columns, click Submit. The Selected Data Set Columns page appears.

SELECTED DATA SET COLUMNS

This page enables you to review your column selections. If you are not satisfied with your selection, click Back to return to the Select Data Set Columns page and revise your selection, or click Select Columns in the navigation bar to make a new selection. If you are satisfied with your selection, click the Data Set link in the navigation bar to go to the Download Customized Data Set page.

Note that on the navigation bar under Customized Data Set the links for Data Set, Code Set, and Data Dictionary are now active. The Data Set link takes you to the Download Customized Data Set page. The Code Set and Data Dictionary links take you to pages that allow for downloading or browsing the code set and the data dictionary for the customized data set, respectively. See Code Set and Data Dictionary for further information.

DOWNLOAD CUSTOMIZED DATA SET

Important Download Information for Internet Explorer Users. This link takes you to information about downloads for users of the Internet Explorer browser.
Download Customized Data Set in ASCII Format. This link opens a dialog box that enables you to save the file in a location of your choice. All customized downloads from a given data set are assigned standardized filenames in the download process. If you are downloading multiple selections from the same data set, it is recommended that you assign unique names to each selection's files. See Information on Downloads and Customized Data Sets for further information.
Browse Customized Data Set. This link opens the Browse Customized Data Set page, which allows browsing of the first 10 rows of data. Use the scroll bars to view columns and rows not displayed on your screen. You may also download the data set from this page. See Data Set for further information about data sets.
Download Complete Data Set. This link, in the section under Accessibility Options, takes you to the Download Complete Data Set page, as described above. It is useful if you change your mind about creating a customized data set and decide to download the complete data set.

DOWNLOAD CODE SET FOR CUSTOMIZED DATA SET

This page allows you to download the code set for a customized data set. To reach this page, on the navigation bar, under Customized Data Set, click Code Set. The page header includes the Entry ID number, title, and a portion of the abstract and the data use and constraints. To view the full abstract and related information, click the View Full Description link just below the text for Data Use and Constraints. The links on this page are described in the following paragraphs.

Important Download Information for Internet Explorer Users. This link takes you to information for downloads when using the Internet Explorer browser.
Download Customized Code Set in ASCII Format. This link opens a dialog box that enables you to save the file in a location of your choice. All customized downloads from a given data set are assigned standardized filenames in the download process. If you are downloading multiple selections from the same data set, it is recommended that you assign unique names to each selection's files. See Code Set and Importing ASCII (.txt) Files for further information.
Browse Customized Code Set. This link opens the Browse Code Set for Customized Data Set page, which allows browsing the code set for the customized data set. You may also download the code set from this page.
Download Complete Data Set. This link, in the section under Accessibility Options, takes you to the Download Complete Data Set page, as described above. It is useful if you change your mind about creating a customized data set and decide to download the complete data set.

DOWNLOAD DATA DICTIONARY FOR CUSTOMIZED DATA SET

This page allows you to download the data dictionary for a customized data set. To reach this page, on the navigation bar, under Customized Data Set, click Data Dictionary. The page header includes the Entry ID number, title, and a portion of the abstract and the data use and constraints. To view the full abstract and related information, click the View Full Description link just below the text for Data Use and Constraints. The links on this page are described in the following paragraphs.

Important Download Information for Internet Explorer Users. This link takes you to information for downloads when using the Internet Explorer browser.
Download Customized Data Dictionary in ASCII Format. This link opens a dialog box that enables you to save the file in a location of your choice. All customized downloads from a given data set are assigned standardized filenames in the download process. If you are downloading multiple selections from the same data set, it is recommended that you assign unique names to each selection's files. See Data Dictionary and Importing ASCII (.txt) Files for further information.
Browse Customized Data Dictionary. This link opens the Browse Data Dictionary for Customized Data Set page, which allows browsing the data dictionary for the customized data set. You may also download the data dictionary from this page.
Download Complete Data Set. This link, in the section under Accessibility Options, takes you to the Download Complete Data Set page, as described above. It is useful if you change your mind about creating a customized data set and decide to download the complete data set.

EIMS NAVIGATION BAR LINKS

As indicated above, some HEDS links take you to EIMS. In EIMS pages, the navigation bar may show both active and inactive links, with inactive links in a subdued color or grayed out. Of the available links, the following are particularly useful to HEDS users.

Summary. This link provides an abstract of the selected project, study, data set, or document, together with identifying and general information.
Contacts. This link provides names of persons or organizations with knowledge of or responsibilities for the entry, together with contact information.
Related Entries. This link provides a list of all entries related to the selected project, study, data set, or document. To view an entry, click on its Related Entry ID link.
Related Documents. This link provides a list of documents related to the selected entry. To view a document, click on its Related Entry ID link.
Related Data Sets. This link provides a list of data sets related to the selected entry. To view a data set, click on its Related Entry ID link.
Downloads. This link provides a list of items that allow actual downloading or that provide links back to HEDS, where you can access additional information through HEDS functionalities. With each item in the list is a button marked View/Download File. Click this button to be taken to the appropriate page. If the item is HEDS Studies, you will be taken to the HEDS Studies page. If the item is Study Directory, you will be taken to the HEDS Study Information Directory page. If the item is Complete Data Set, you will be taken to the HEDS Download Complete Data Set page. If the item is Complete Document, the document downloads. Actual downloading processes may vary, depending on your computer setup and software.
Quality Assurance. This link provides information on data use and constraints regarding the data set or document.

The following links are also available.

Search Results. If you have performed a search in EIMS, this link takes you back to the last set of EIMS search results. If you have come into EIMS from HEDS, this link takes you back to the most-recent page in HEDS.
New Search. This link takes you to an EIMS Simple Search Form. This search form is useful for perusing the contents of EIMS for entries under various topics, including HEDS. However, the results of such a search do not automatically provide access to all HEDS functionality if the Partner field is changed from ORD_HEDS.
Administrative Details. This link provides general administrative information about the current entry.
Access Info/Policy. This link provides access information about the entry.
Full Metadata Record. This link provides a print-ready page containing all metadata related to the current entry.

INFORMATION ON DOWNLOADS

CONTENTS OF A DOWNLOAD PACKAGE

Information downloaded from HEDS includes data set metadata (as a readme file), data set file(s), data dictionary file(s), and code set file(s). For a complete data set, these files are provided as a group in a zipped package. For customized data sets, the files are not in a zipped package but are provided as individual files. The following sections describe aspects of downloads.

DATA SET METADATA

Data Set Metadata Sample

DATA SET PACKAGES

General
Importing dBase IV (.dbf) Files
Importing ASCII (.txt) Files
Importing Customized Data Sets
File Naming Conventions
Segments for Complete Data Sets
Customized Data Sets
Important Download Information for Internet Explorer Users

DATA SET

DATA DICTIONARY

CODE SET

DATA SET METADATA

Data set metadata information is taken from the Environmental Information Management System (EIMS), ORD's metadata repository. EIMS metadata can be accessed via the Data Set Metadata link on the HEDS navigation bar or the Study Metadata link on the Study Information Directory page.

Data Set Metadata Sample

An example of data set metadata follows.

Data Set Description

Entry ID:

17419

Data Set contains 119 columns, 459 rows, and 1 section(s)

Name:

NHEXAS PHASE I REGION 5 STUDY--DESCRIPTIVE QUESTIONNAIRE DATA

Abstract:

This data set includes responses for 459 descriptive questionnaires. The Descriptive Questionnaire was used to enumerate individuals within a household for sampling purposes (basis for selection of sample individual), to identify general characteristics of the living quarters and occupants, and to provide a basis for assessing potential bias due to refusals in subsequent steps. It includes a few general questions about the household and a set of demographic questions about each full-time resident of the household. Keywords: questionnaire; exposure survey.

The National Human Exposure Assessment Survey (NHEXAS) is a federal interagency research effort coordinated by the Environmental Protection Agency (EPA), Office of Research and Development (ORD). Phase I consists of demonstration/scoping studies using probability-based sampling designs. The NHEXAS Phase I Questionnaires were organized into six modules for simplicity in administration (to minimize respondent burden and maximize participation rates at each step) and for collecting information that can be temporally related to the exposure, concentration and/or biological measurements collected in NHEXAS: Descriptive, Baseline, Technician, Follow-up, Time and activity diary, and Dietary diary (and follow-up). The Region 5 study was conducted in EPA's Region 5 (Ohio, Michigan, Illinois, Indiana, Wisconsin, and Minnesota), and included personal exposure, residential concentration, and biomarker measurements of metals and VOCs. The study was conducted by the Research Triangle Institute (RTI) and the Environmental and Occupational Health Sciences Institute (EOHSI). The scope and design of the study are detailed in the following article: E. Pellizzari et al., Population-Based Exposure Measurements in EPA Region 5: A Phase I Field Study in Support of the National Human Exposure Assessment Survey. Journal of Exposure Analysis and Environmental Epidemiology, Vol. 5, No. 3, 1995, pp. 327-358.

Data Use And Constraints:

These data are the result of a probability-based sampling design specific to the population under study. Thus the data may or may not be representative of subsets of this study's population or of other populations. The study was designed to test certain hypotheses and thus may limit its applicability for other purposes. When using these data it is important to consider the percentage of nonresponses or nondetects in the data as an indicator of its usefulness for other purposes. No liability is accepted by the U.S. EPA for any errors or omissions in the results included in the data set, associated information and/or documentation.

Based on the output format selected by the user and the software into which the data set is imported, the user may notice that measurements and sampling weights are zero-padded to the right of nonzero decimal digits. These zeroes are purely a function of the formatting process and the software's acceptance of that format and should not be construed to represent significant digits in the value. Measurement values are provided with four significant digits; sampling weights are provided with three decimal digits.

Notice:

The current data is draft data and should not be used for any definitive purposes.

Additional Information:

To access the data set, click on the downloads link on the navigation bar. Then click on the download entry "Access Data Set".

Version:

1.0.

DATA SET PACKAGES

General

The data set portion of a typical download package contains three types of information for the selected data set: data, data dictionary, and code set. The data dictionary provides characteristics of the data columns in a given data file. The code set provides the map between the code values used for the responses in the data and the descriptions those code values represent.

The download package is a zipped file that can be opened using most unzip software. For information on software to use in unzipping the file, go to the HEDS Home Page and click on the Related Web Sites link.
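For readers who prefer to inspect a package with a script rather than a desktop unzip tool, the following is a minimal sketch using Python's standard zipfile module. The member names in the demonstration archive are illustrative only; they follow the naming pattern described under File Naming Conventions and are not taken from a real package.

```python
import io
import zipfile


def list_package(zip_source):
    """Return the sorted member names of a zipped download package.

    zip_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as package:
        return sorted(package.namelist())


# Build a small in-memory archive to stand in for a downloaded package.
# The file names are hypothetical examples, and the members are empty.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    for name in ("12345ds.dbf", "12345dd.dbf", "12345cs.dbf"):
        demo.writestr(name, b"")
buffer.seek(0)

names = list_package(buffer)
print(names)
```

With a real download, pass the path of the saved .zip file to list_package instead of the in-memory buffer.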

The individual files are provided in dBase IV format (.dbf) or ASCII format (.txt) and should be importable into most database or analytical software tools. The files follow naming conventions and other specifications that allow importing into a broad range of tools and versions likely to be used. In ASCII files, the columns are delimited by tabs, and columns with text are surrounded by double quotation marks (").

Analytical software packages do not all handle these formats in the same way. Some make their own judgments about how to interpret what is included in the file to be imported. Files downloaded from HEDS were imported into Excel 97, Access 97, SPSS 10, and dBase IV. The following notes describe characteristics of the software that should be taken into account in importing the data, data dictionary, and code set files. In most cases, data in the .dbf format imports more easily and with fewer problems than data in the .txt format.

Importing dBase IV (.dbf) Files

Note: For a data set downloaded in .dbf format, two additional files (xxxxxds.prt and xxxxxds.txt) may appear in the unzipped data set. These are files dBase uses to help with printing. They do not affect use of other files.

Excel 97 imports data in the .dbf format without problems. It keeps the numeric formats with the correct number of decimal digits. Excel 97 seems to set a default cell format for all columns as "number" with zero decimal digits.

Access 97 imports data in the .dbf format. It changes numeric columns to double (double precision), and it sets the width of all text fields to 255 characters.

SPSS 10 imports data in the .dbf format without problems, and it carries over format correctly. It adds a first column called d_r that has all blank values.

dBase IV imports data in the .dbf format without problems.

Importing ASCII (.txt) Files

In ASCII files, the columns are delimited by tabs, and columns with text are surrounded by double quotation marks ("). These characteristics have an impact on the importing of .txt files into the four software packages previously noted.

Excel 97 provides a text import wizard. In the successive screens of the wizard, selections should be made in Step 1 for Delimited; in Step 2 for Tab delimiter and double quote ("); and in Step 3 for General as the Column Data Format.

Excel 97 does not carry over the data type. For example, if a column is formatted as character and contains only numeric values, Excel changes the format to numeric. You can correct this problem in Step 3 of the wizard. First, under Data Preview, select the column. Then, under Column Data Format, select Text. Repeat this procedure for other columns as desired before you click Finish. The data dictionary included with the data set will help you determine the format for a column.

Excel 97 does not carry over the column width. Thus, for example, all of the decimal digits of a value may not be visible. To correct this problem, adjust the column widths as necessary in the Excel worksheet.

Access 97 follows its own rules for converting data. It generalizes, not keeping detailed format information. For example, it changes integer to long integer and float type columns to double (double precision). Access 97 shows all of what is in the field in terms of decimal digits and width within the definition of its data type. All character columns are assigned a text data type with a length of 255.

SPSS 10 encounters several errors in importing .txt files. It cannot read the column names, which have double quotation marks around them, and it converts column names to variable names V1, V2, etc. SPSS 10 retains the double quotation marks around character strings in the data cells, a fact that also increases the width of the character fields by 2. It defines the width for numeric fields based on the values in the first couple of rows. It also uses 2 as the default number of decimal digits. You can change these through various SPSS options or as part of the import process. The actual values are maintained, but the format affects what appears on output.

dBase IV will not import ASCII-formatted (.txt) files.
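If you are scripting the import rather than using one of the four packages above, the tab delimiter and double-quoted text columns can be handled with a general-purpose CSV reader. The sketch below uses Python's standard csv module. The column names and values in the sample are hypothetical, not taken from an actual HEDS file.

```python
import csv
import io


def read_heds_ascii(text):
    """Parse tab-delimited text with double-quoted strings.

    Returns (header, rows), where header is a list of column names and
    rows is a list of lists. Every value is kept as a string, so the
    data dictionary is still needed to decide which columns are numeric.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    rows = list(reader)
    if not rows:
        return [], []
    return rows[0], rows[1:]


# Hypothetical sample: quoted column names, one quoted ID with leading
# zeros, one quoted text value, and one unquoted numeric value.
sample = '"HHID"\t"CITY"\t"AGE"\n"00123"\t"Ann Arbor"\t42\n'
header, rows = read_heds_ascii(sample)
print(header)
print(rows)
```

Because nothing is converted automatically, a character column that happens to contain only digits keeps its leading zeros, which avoids the type-guessing problem described for Excel 97 above.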

Importing Customized Data Sets

Customized data sets are currently available only in ASCII format. Depending on the target software, it may be easier to import the files into Excel first and then import them into the other software.

File Naming Conventions

The following file naming conventions are used. If the HEDS Entry ID number for the data is 12345 and the files are in dBase IV format, then the data set (ds) file has a name like 12345ds.dbf. The data dictionary (dd) file describing this data set has a name like 12345dd.dbf. The related code set (cs) file has a name like 12345cs.dbf.
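As an illustration of the convention, the sketch below splits a file name into its Entry ID, file type, and extension. It also accepts an optional single trailing segment digit, as described under Segments for Complete Data Sets. The function name is hypothetical, and treating ASCII files as using the same pattern with a .txt extension is an assumption, since the convention above is stated only for dBase IV files.

```python
import re

# Entry ID digits, a two-letter type code, an optional segment digit,
# and the extension. Accepting ".txt" here is an assumption.
_NAME = re.compile(
    r"^(?P<entry_id>\d+)(?P<kind>ds|dd|cs)(?P<segment>\d?)\.(?P<ext>dbf|txt)$",
    re.IGNORECASE,
)

KINDS = {"ds": "data set", "dd": "data dictionary", "cs": "code set"}


def parse_heds_filename(name):
    """Return the parts of a HEDS-style file name, or None if it does not match."""
    match = _NAME.match(name)
    if match is None:
        return None
    segment = match.group("segment")
    return {
        "entry_id": match.group("entry_id"),  # kept as text, as in the name
        "kind": KINDS[match.group("kind").lower()],
        "segment": int(segment) if segment else None,
        "format": match.group("ext").lower(),
    }


print(parse_heds_filename("12345ds.dbf"))
print(parse_heds_filename("45678cs2.dbf"))
print(parse_heds_filename("readme.txt"))
```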

Segments for Complete Data Sets

To fit most software packages and their most-likely-used versions, the files containing the actual data (data set files) have been limited to a maximum of 255 data columns. If the entire data set contains no more than 255 data columns, the download package contains one data set file, one data dictionary file, and one code set file. Thus, if HEDS data set 12345 has 1 segment, and if the download is in .dbf format, the following files would be included in this package:

12345ds.dbf	Data set
12345dd.dbf	Data dictionary for the data set
12345cs.dbf	Code set for the data set

However, if a complete data set contains more than 255 data columns, the files in the download package are provided in segments. Each segment contains the same set of record identifying columns at the beginning of each record to enable matching of data from different segment files. The segment number is included as the last character in the segmented file's filename. For example, if HEDS data set 45678 has two segments, and if the download is in .dbf format, the names of the files for the second segment's data set, data dictionary, and code set would be 45678ds2.dbf, 45678dd2.dbf, and 45678cs2.dbf, respectively.

Some users may have software enabling them to merge segmented files into joined files. Therefore, the download package for segmented files also includes a data dictionary and code set for the entire, nonsegmented data set. These data dictionary and code set files for the entire data set do not contain a segment number as the last character in the file name. Note that the package does not include a nonsegmented data set. Thus, if HEDS data set 45678 has two segments, and if the download is in .dbf format, the following files would be included in this package:

45678ds1.dbf	Segment 1 file of the data set
45678ds2.dbf	Segment 2 file of the data set
45678dd1.dbf	Data dictionary for segment 1 data file
45678dd2.dbf	Data dictionary for segment 2 data file
45678cs1.dbf	Code set for segment 1 data file
45678cs2.dbf	Code set for segment 2 data file
45678dd.dbf	Data dictionary for the complete data set
45678cs.dbf	Code set for the complete data set
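The two file listings above can be reproduced with a short sketch, which may be useful for checking that an unzipped package is complete. The function name is hypothetical. The readme (metadata) file is left out because its name is not given here, and the sketch accepts at most nine segments because the segment number is described as a single trailing character.

```python
def expected_package_files(entry_id, segments=1, ext="dbf"):
    """List the data, data dictionary, and code set files for a package."""
    if not 1 <= segments <= 9:
        raise ValueError("segments must be between 1 and 9")
    kinds = ("ds", "dd", "cs")
    if segments == 1:
        # Unsegmented package: one file of each type, no segment number.
        return [f"{entry_id}{kind}.{ext}" for kind in kinds]
    # Segmented package: one file of each type per segment ...
    files = [
        f"{entry_id}{kind}{number}.{ext}"
        for kind in kinds
        for number in range(1, segments + 1)
    ]
    # ... plus a data dictionary and code set for the complete data set.
    # There is no unsegmented data set file.
    files += [f"{entry_id}{kind}.{ext}" for kind in ("dd", "cs")]
    return files


print(expected_package_files(12345))
print(expected_package_files(45678, segments=2))
```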

In this example, the data dictionary for the complete data set would include data columns in the following order: the record identifying columns from segment 1, the data columns from segment 1, and data columns following the repeated record identifying columns from segment 2. Segments are created to keep data columns of a similar category together. This may not reflect the original order of columns in a questionnaire.
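Because every segment repeats the record identifying columns, segment files can be rejoined by matching on those columns. The following is a minimal sketch of such a merge for rows already loaded as dictionaries. The column names (HHID, Q1, Q300) are hypothetical, and the sketch assumes each key value appears at most once per segment.

```python
def merge_segments(first_rows, second_rows, key_columns):
    """Join rows from two segments on their shared record identifying columns.

    Each row is a dict mapping column name to value. Rows in first_rows
    with no match in second_rows are kept unchanged.
    """
    index = {
        tuple(row[column] for column in key_columns): row
        for row in second_rows
    }
    merged = []
    for row in first_rows:
        key = tuple(row[column] for column in key_columns)
        combined = dict(row)
        for column, value in index.get(key, {}).items():
            if column not in key_columns:  # do not duplicate the key columns
                combined[column] = value
        merged.append(combined)
    return merged


segment_1 = [{"HHID": "001", "Q1": "1"}, {"HHID": "002", "Q1": "2"}]
segment_2 = [{"HHID": "002", "Q300": "9"}, {"HHID": "001", "Q300": "8"}]
merged = merge_segments(segment_1, segment_2, ["HHID"])
print(merged)
```

The rows are matched by key rather than by position, so the segments do not need to be sorted the same way.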

Customized Data Sets

All customized downloads from a given data set are assigned standardized filenames in the download process. If you are downloading multiple selections from the same data set, it is recommended that you assign unique names to each selection's files.

For a customized data set of 255 or fewer data columns, one data file, one data dictionary file, and one code set file will be available for download through links on separate pages. The data file will contain the record identifying columns for the data set followed by the selected columns in the order they appear in the full data set.

For a customized data set of more than 255 columns, the files in the download package are provided in sections. (Sections of customized data sets are similar to segments of complete data sets.) The first section will include the record identifying columns followed by the selected columns in the order they appear in the full data set up to a total of 255 columns. The next section will include the record identifying columns followed by the next set of selected columns in the order they appear in the full data set up to a total of 255 columns. The sections do not necessarily keep data columns for categories of information together as in the complete data set download. There is also no data dictionary or code set for the full selection.
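
The way a large selection is divided into sections can be illustrated with a short Python sketch. It assumes that the 255-column total for each section includes the repeated record identifying columns; that reading of the limit, the function name split_into_sections, and the column names are assumptions made for illustration only.

```python
def split_into_sections(id_cols, selected_cols, max_cols=255):
    """Divide selected columns into sections of at most max_cols columns.

    Every section starts with the record identifying columns, followed by
    the next run of selected columns in their original order.
    """
    per_section = max_cols - len(id_cols)
    if per_section <= 0:
        raise ValueError("identifying columns alone reach the column limit")
    sections = []
    for start in range(0, len(selected_cols), per_section):
        chunk = selected_cols[start:start + per_section]
        sections.append(list(id_cols) + list(chunk))
    return sections


ids = [f"ID{i}" for i in range(1, 6)]        # 5 hypothetical identifying columns
chosen = [f"Q{i}" for i in range(1, 601)]    # 600 hypothetical selected columns
print([len(s) for s in split_into_sections(ids, chosen)])
```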

It is suggested that if you are selecting more than 255 data columns you download the complete data set package.

Important Download Information for Internet Explorer Users

The Internet Explorer (5.0+) browser does not process the customized download files in the same manner as the Netscape browser. It is recommended that, if you are using Internet Explorer (5.0+), you use one of the following methods to download the customized files.

Method 1

On the navigation bar, under Customized Data Set, click Data Set. The Download Customized Data Set page appears.
Right-click on the Download Customized Data Set in ASCII Format link. This brings up a menu box. In the menu box, click Save Target As. This brings up the Save As dialog box.
(1) In the File Name field, change the name of the file to be saved to something meaningful for you. (2) In the Save As Type box, choose text as the Save As type.
See the paragraph beginning "For either method," below.

Method 2

On the navigation bar, under Customized Data Set, click Data Set. The Download Customized Data Set page appears.
Click on the Download Customized Data Set in ASCII Format link. A window appears containing the download file.
Right-click on the area where the download file appears. This brings up a menu box. In the menu box, click View Source. This opens a Notepad window containing the download file.
Click File, Save As. This brings up the Save As dialog box.
(1) In the File Name field, change the name of the file to be saved to something meaningful for you. (2) In the Save As Type field, select text.
See the paragraph beginning "For either method," below.

For either method, at this point it is necessary to edit the saved file before using it. Open the saved file in your preferred editing software. You will see tab delimited data fields, with double quotation marks (") around text columns, preceded by header text such as: Content-Type: text/tab separated values, and Content-Disposition: attachment; filename=17420cs1.txt. This header text precedes each section of data within the file. Each section must be saved as a separate file, without the header, before it can be imported into other software.
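
The manual editing step described above could also be done with a short script. The following Python sketch assumes that each section is preceded by a Content-Type line and a Content-Disposition line naming the file, as in the example text; the exact header layout in a saved file may differ, and the function name split_sections is hypothetical.

```python
import re

# Captures the file name announced in a Content-Disposition header line
HEADER_FILENAME = re.compile(r'Content-Disposition:.*?filename="?([^\s;"]+)')


def split_sections(text):
    """Return {file name: section text} with the header lines removed."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = HEADER_FILENAME.search(line)
        if match:
            current = match.group(1)      # start collecting a new section
            sections[current] = []
        elif line.startswith("Content-Type:"):
            continue                      # header line, not data
        elif current is not None and line.strip():
            sections[current].append(line)
    return {name: "\n".join(lines) + "\n" for name, lines in sections.items()}
```

Each value in the returned dictionary can then be written to its own file, for example with open(name, "w").write(body), before importing it into other software.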

If you use Internet Explorer, it may be easier to download the complete data set.

See Importing Customized Data Sets for further information.

DATA SET

HEDS data sets fall into three general categories: questionnaires, analytical results, and QA analytical results. Because studies may differ in the information collected, one format cannot be used for all data sets in a category. However, some consistency in approach has been followed.

In a data set containing questionnaire responses, each row of the data file represents one participant's responses. If the questionnaire is administered once to each participant, there is one row per participant. If the questionnaire is administered to a participant more than once, each row represents that participant's responses on a particular date.

In a data set file, the first set of data columns contains columns for participant identification, columns relating to the sampling design of the study, columns for sampling weights related to a probability-based sampling design if used, columns with dates relating to the administration of the questionnaire, and columns with miscellaneous record information. These are the record identifying columns, and they are repeated in any segment or section. The remaining data columns provide the responses to the questions in the order they appear in the questionnaire. At the end of all the response columns, there may be additional columns containing calculated data.

In many studies, the names of the data columns reflect the question number. To facilitate the use of these data files in multiple versions and types of software, the column names have been limited to eight characters beginning with an alphabetic character. The .dbf or .txt files include the column names and the data.
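
As an illustration, a .txt data file could be read with Python's standard csv module. The sketch assumes the file is tab delimited, that text values are enclosed in double quotation marks, and that the column names appear in the first row; the function name read_heds_txt and the sample column names are hypothetical.

```python
import csv
import io


def read_heds_txt(handle):
    """Read a tab delimited file into a list of {column name: value} records."""
    reader = csv.DictReader(handle, delimiter="\t", quotechar='"')
    return list(reader)


# A small in-memory stand-in for a downloaded .txt file
sample = io.StringIO('"PARTID"\t"Q1"\n"001"\t2\n')
print(read_heds_txt(sample))
```

All values are returned as text, so numeric columns still need to be converted according to the data type given in the data dictionary.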

A data set containing analytical results includes results for a similar set of analyzed samples, for example, metals in air or volatile organic compounds (VOCs) in water. The organization of analytical results across a study's data files is study-dependent. Each record in an analytical results data file represents all the results from one analysis of a sample. In most cases, the sample ID uniquely identifies the rows; in some cases, additional data columns may be required. (See the Primary Key column in the data dictionary.) The first group of data columns includes (1) various basic information associated with the study, such as sample ID, sampling medium, analyte (chemical) class, and participant ID; and (2) information related to the study's sampling design and any probability-based sampling weights. These are the record identifying columns, and they are repeated in any segment or section. The remaining data columns include the analytical results in groups of columns organized by analyte. One analyte's group of columns might include the concentration, the units of concentration, a detection or quantitation limit, a data quality flag, and a comment about the result for this sample. The data file from a metals analysis for a set of samples might include a group of columns each for lead, cadmium, chromium, and arsenic. Additional columns might include comments or ancillary information on the sample. The column names have been limited to eight characters beginning with an alphabetic character, and the .dbf or .txt files include the column names and the data.

A data set containing QA analytical results will include results only for a specific type of QA sample (blanks, duplicates, or spikes) for an analyte class. The organization of QA analytical results across a study's data files is study-dependent. The organization of data columns in these data files is similar to that in the analytical results files described above. In addition, these data files contain columns specific to the type of QA analyses included in the file, such as type of duplicate (field or analytical), percent recovery for spikes, and the sample ID for a replicate sample, if applicable. These analytical results are also grouped in columns by analyte. The column names have been limited to eight characters beginning with an alphabetic character, and the .dbf or .txt files include the column names and the data.

DATA DICTIONARY

The data dictionary provides characteristics of the data columns in a given data file. Each data dictionary file contains the same columns of descriptors. Each row in a data dictionary file provides information on one of the data columns in the data file. The rows in the data dictionary are in the same order as the data columns in the data file. The columns in the data dictionary are as follows:

COL_NAME -- column name, max 8 characters (see Notes below)

COLLABEL -- column label, max 40 characters (see Notes below)

EXT_DESC -- extended column description, max 250 characters (see Notes below)

DATATYPE -- column data type [number, integer, character, or string (alphanumeric)]; max 9 characters (NOTE: date is not an acceptable data type for HEDS. A predefined numeric format is used for dates and years. See Notes below.)

COLWIDTH -- maximum number of spaces taken up by data column, max 4 digits

COL_FMT -- column format, specifies a format for reading and printing the data in the column (see Notes below)

UNITS -- units, max 10 characters (blank if units are specified in a separate column)

PRMRYKEY -- primary key, max 1 character, contains "Y" or "P" if column is a primary key for the data file, blank otherwise

MINIMUM -- minimum permissible value, max 20 characters, present for numeric data column, does not include nonresponse code values. Permissible values are the values included in the data column that are considered usable. (See Notes below.)

MAXIMUM -- maximum permissible value, max 20 characters, present for numeric data column, does not include missing value codes. Permissible values are the values included in the data column that are considered usable. (See Notes below.)

MISSVALS -- list of the study-defined missing value or nonresponse codes for this column, separated by a semicolon (;) or a comma (,), max 40 characters, present if data column contains any nonresponse codes. Some analytical software tools, such as SAS and SPSS, allow a user to define missing values for a data column. This software option enables a user to include or not include in an analysis any rows that contain the assigned missing values.

COL_NUM -- a number that represents a relative ordering of the columns for the data file, not data set, associated with the data dictionary. This ordering will differ depending on whether the data file is a segment of the complete data set or a customized data set.

WT_NAME -- list of weight names associated with this data column, max 100 characters, present if study uses probability-based sampling weights and if weights can be applied to the data column. These names are links to study documentation describing the creation and use of weights.

WT_TYPE -- type of weight to be used for this data column, if more than one level of response, e.g., household and participant, is included in the data file, max 10 characters, not required but included if study uses probability-based sampling weights and if weights can be applied to the data column.

COMMENTS -- additional information about the data column, max 80 characters

CHECKSUM -- sum of all the values in the data column to be used as a check that all records in the data file have transferred correctly; a number, where size depends on data column; available for most numeric columns
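
A transfer check against CHECKSUM might look like the following Python sketch. It assumes the checksum is the plain sum of every non-blank value in the column, including any nonresponse codes; the source does not say how such codes are treated, so this is an assumption. The function name column_checksum is hypothetical.

```python
from decimal import Decimal


def column_checksum(values):
    """Sum the non-blank values of one data column using exact decimals."""
    total = Decimal("0")
    for value in values:
        text = "" if value is None else str(value).strip()
        if text:
            total += Decimal(text)
    return total


# Compare the recomputed sum with the CHECKSUM entry from the data dictionary
column = ["1.5", "2.25", "", "-0.75"]
print(column_checksum(column) == Decimal("3"))
```

Decimal arithmetic is used so that the comparison is not affected by binary floating point rounding.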

Notes

The following notes relate to the columns in a data dictionary file.

COL_NAME, COLLABEL and EXT_DESC are progressively detailed ways to describe what is contained in the data column. COL_NAME should be able to transfer with the data to provide the column name in most analytical software. Some software, such as SPSS and SAS, may provide options for using the COLLABEL. For both COL_NAME and COLLABEL, however, the number of characters allowed may not be adequate for a good description of the column's contents. Thus the extended column description is available and includes the actual phrasing of the question from a questionnaire, where available.

COL_FMT describes any formatting definitions for a data column. Decimal values include the decimal point. Specifications for COL_FMT are as follows:

Number data type -- COL_FMT has a value in the form Fw.d (e.g., F7.3). This indicates that the numbers in the associated data column are formatted as decimal numbers with a maximum width of w and d decimal digits, where w indicates the maximum number of spaces taken up by the value, and d indicates the number of digits to the right of the decimal point. Each character, including the decimal point and the minus sign (for negative numbers), counts as one space. Thus, for a format of F7.3, a number in the data column would look like 123.456, and a negative number would look like -12.345. In conversions of HEDS data to the ASCII and dBase formats and subsequent importation of the data into a particular software package, some data may appear to have trailing zeroes. Note that most analytical results data are formatted to three significant digits. Other data, such as sampling weights, are formatted to have a specific number of decimal digits as noted in the data dictionary.
Character data type -- COL_FMT has a value in the form Aw (e.g., A10). A value of A10 indicates that the values in the associated data column are alphanumeric with a maximum of 10 characters. Values in this format are usually left-justified in the field.
Date data types -- No Date data types are used. Dates and years are stored as number data types and thus have no separators between year, month, and day. A date appears in the format yyyymmdd, where yyyy represents the year, mm the number of the month (01-12), and dd the day of the month (01-31). A year appears in the format yyyy.
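
The format specifications above can be interpreted programmatically. The following Python sketch parses the Fw.d and Aw forms and converts a yyyymmdd number into a calendar date. The function names are hypothetical, and the sketch does not handle nonresponse codes that may appear in a date column.

```python
import re
from datetime import date


def parse_col_fmt(col_fmt):
    """Return (kind, width, decimals) for an Fw.d or Aw format string."""
    match = re.fullmatch(r"F(\d+)\.(\d+)", col_fmt)
    if match:
        return ("number", int(match.group(1)), int(match.group(2)))
    match = re.fullmatch(r"A(\d+)", col_fmt)
    if match:
        return ("character", int(match.group(1)), None)
    raise ValueError(f"unrecognized COL_FMT: {col_fmt!r}")


def parse_heds_date(value):
    """Convert a number in yyyymmdd form into a datetime.date."""
    text = str(int(value))
    if len(text) != 8:
        raise ValueError(f"not a yyyymmdd value: {value!r}")
    return date(int(text[:4]), int(text[4:6]), int(text[6:]))


print(parse_col_fmt("F7.3"))
print(parse_col_fmt("A10"))
print(parse_heds_date(19960704))
```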

MINIMUM, MAXIMUM, and MISSVALS help in understanding what to expect from the data in a column by specifying the end points of the permissible value range and the code values for nonresponses. These values are available only for numeric data columns.
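
For software that does not define missing values automatically, the three entries can be applied by hand. The Python sketch below splits a MISSVALS entry on semicolons or commas and keeps only the values of a numeric column that are not nonresponse codes and that fall within the permissible range. The function names and the example codes 98 and 99 are hypothetical.

```python
import re


def parse_missvals(missvals):
    """Split a MISSVALS entry on semicolons or commas into a list of codes."""
    return [code.strip() for code in re.split(r"[;,]", missvals) if code.strip()]


def usable_values(values, missvals="", minimum=None, maximum=None):
    """Drop nonresponse codes and values outside the permissible range."""
    missing = {float(code) for code in parse_missvals(missvals)}
    kept = []
    for value in values:
        if float(value) in missing:
            continue  # study-defined nonresponse code
        if minimum is not None and value < minimum:
            continue
        if maximum is not None and value > maximum:
            continue
        kept.append(value)
    return kept


print(usable_values([1, 5, 98, 99, 3], "98;99", minimum=1, maximum=5))
```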

The data dictionary .dbf or .txt files include the column names and the data dictionary information.

CODE SET

Code sets are a shorthand for recording responses in a data set. For example, a single number or character can be defined as a placeholder for the full response. Thus the number 1 can be assigned for the response "male," and the number 2 for "female." The numbers 1 and 2 are the code values; male and female are the code descriptions.

The code set file contains the code sets for all the data columns in the associated data file. A code set is available for each data column that contains code values, whether those code values represent descriptive responses or nonresponse categories. (Nonresponse categories include responses like "missing," "don't know," and "not applicable." Some software packages allow handling of nonresponses differently than permissible values when processing the data.) The code set for a given data column is identified by the name of the data column using it. If several columns use the same code set, a code set is included for each with the data column name as identifier. Each row in the code set contains information for a unique code value in a code set, including nonresponse codes. The code set file has the following format:

COL_NAME -- name of data column to which code values and descriptions apply, max 8 characters

CHAR_VAL -- code value for code set in character format, max 10 characters; exists for all code values

NUM_VAL -- code value for code set in numeric format; available only for numeric data columns

SHRTDESC -- short description associated with code value and for use in software package labeling, max 20 characters

EXT_DESC -- extended description associated with code value, max 250 characters, available when short description is not adequate for understanding

The code set .dbf or .txt files include the column names and the code set information.
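
One way to use a code set file is to build a lookup from code value to short description for each data column. The Python sketch below assumes the code set has been read into a list of records keyed by the column names listed above; the function names and the data column name SEX are hypothetical.

```python
def build_code_lookup(code_rows):
    """Return {data column name: {code value: short description}}."""
    lookup = {}
    for row in code_rows:
        codes = lookup.setdefault(row["COL_NAME"], {})
        codes[row["CHAR_VAL"].strip()] = row["SHRTDESC"]
    return lookup


def decode(lookup, col_name, value):
    """Return the short description for a code, or the value itself if uncoded."""
    return lookup.get(col_name, {}).get(str(value).strip(), value)


code_rows = [
    {"COL_NAME": "SEX", "CHAR_VAL": "1", "SHRTDESC": "male"},
    {"COL_NAME": "SEX", "CHAR_VAL": "2", "SHRTDESC": "female"},
]
lookup = build_code_lookup(code_rows)
print(decode(lookup, "SEX", 2))
```

The lookup uses CHAR_VAL because, per the description above, it exists for all code values, whereas NUM_VAL is available only for numeric data columns.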

Office of Research & Development | National Exposure Research Laboratory
Send questions or comments to Carry Croghan,
Webmaster at Croghan.Carry@epa.gov
