Sustainability of Digital Formats
 Planning for Library of Congress Collections


HDF4, Hierarchical Data Format, Version 4 and earlier


Identification and description

Full name HDF4, Hierarchical Data Format, Version 4 and earlier
Description At its lowest level, HDF is a physical file format for storing scientific data. The data structure types that HDF supports are Scientific Data Sets, Raster Images (General, 8-bit, 24-bit APIs), color palettes, text entries, and Vdatas and Vgroups.
  • Scientific Data Sets (SDSs) are used for storing n-dimensional gridded data. The data in a dataset can be of any of the "standard" number types: 8-, 16-, and 32-bit signed and unsigned integers, and 32- and 64-bit floating point values. In addition, the SD interface allows SD data sets with variable bit lengths (1 to 32 bits) to be created. Metadata such as dimension scales and attributes can also be stored with an SDS.
  • Vgroups are generic grouping elements allowing a user to associate related objects within an HDF file. As Vgroups can contain other Vgroups, it is possible to build a hierarchical file. Vdatas are generic list objects. By combining Vdatas in Vgroups, it is possible to represent higher level data constructs: mesh data, multi-variate datasets, sparse matrices, finite-element data, spreadsheets, splines, non-Cartesian coordinate data, etc.
At its highest level, HDF is a collection of utilities and applications for manipulating, viewing, and analyzing data in HDF files. Between these levels, HDF is a software library that provides high-level APIs and a low-level data interface.
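The way Vgroups nest to form a hierarchy can be illustrated with a small in-memory model. This is only a sketch of the concept: the `Vdata` and `Vgroup` classes, the `walk` helper, and the slash-separated paths are hypothetical names chosen for illustration, and none of them belong to the HDF library or its APIs.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Vdata:
    """Illustrative stand-in for a Vdata: a named list of records."""
    name: str
    fields: List[str]
    records: List[tuple] = field(default_factory=list)


@dataclass
class Vgroup:
    """Illustrative stand-in for a Vgroup: holds Vdatas and other Vgroups."""
    name: str
    members: List[Union["Vgroup", Vdata]] = field(default_factory=list)


def walk(node, prefix=""):
    """Yield a path for the node and, depth-first, for everything under it."""
    path = f"{prefix}/{node.name}"
    yield path
    if isinstance(node, Vgroup):
        for member in node.members:
            yield from walk(member, path)


# A tiny "mesh" construct: node coordinates plus a nested group of elements.
mesh = Vgroup("mesh", [
    Vdata("nodes", ["x", "y"], [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]),
    Vgroup("elements", [
        Vdata("triangles", ["a", "b", "c"], [(0, 1, 2)]),
    ]),
])

paths = list(walk(mesh))
for p in paths:
    print(p)
```

Because a Vgroup may contain another Vgroup, the same two building blocks are enough to express the higher-level constructs listed above, such as mesh data.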
Production phase Generally used for middle- and final-state archiving.
Relationship to other formats
    Has subtype Includes version 4.x and previous releases not documented separately here.
    Affinity to HDF5, Hierarchical Data Format, Version 5

Local use

LC experience or existing holdings None
LC preference None

Sustainability factors

Disclosure The HDF software was developed and supported by NCSA and is freely available. In July 2005, NCSA announced that the "Hierarchical Data Format group is spinning off from the National Center for Supercomputing Applications (NCSA) as a non-profit corporation supporting open source software and non-proprietary data formats."

Source code for the HDF libraries is available in Fortran and C. Some tools are available as Java source.

    Documentation http://www.hdfgroup.org/products/hdf4/
Adoption The freely available HDF tools are used by an estimated 2 million users worldwide, in fields including environmental science, neutron scattering, non-destructive testing, and aerospace, and by entities including the U.S. Department of Energy, NASA, and Boeing. Scientific projects that use HDF include NASA's HDF-EOS project and the DOE's Advanced Simulation and Computing Program.
    Licensing and patents None.
Transparency TBD.
Self-documentation An HDF structure is self-describing, allowing an application to interpret the structure and contents of a file without any outside information. Supports user-defined attributes and annotations.
External dependencies None.
Technical protection considerations None.

Quality and functionality factors

Dataset
Normal rendering Normal rendering for datasets not established yet.

File type signifiers

Tag | Value | Note
Filename extension | hdf | From The File Extension Source.
Internet Media Type | application/x-hdf | From The File Extension Source.
Magic numbers | Hex: 0E 03 13 01 | From The File Extension Source.
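The magic number above can be used for a quick format check. The following is a minimal sketch, assuming the four bytes appear at the very start of the file; the function names `has_hdf4_magic` and `looks_like_hdf4` are illustrative and are not part of any HDF tool.

```python
# The four signature bytes listed above: 0E 03 13 01.
HDF4_MAGIC = bytes([0x0E, 0x03, 0x13, 0x01])


def has_hdf4_magic(data: bytes) -> bool:
    """Return True if the byte string begins with the HDF4 magic number."""
    return data[:4] == HDF4_MAGIC


def looks_like_hdf4(path: str) -> bool:
    """Read the first four bytes of a file and compare them to the magic number."""
    with open(path, "rb") as f:
        return has_hdf4_magic(f.read(4))
```

A matching signature only suggests that a file is HDF4; it does not validate the rest of the file's structure.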

Notes

General

There are two HDF formats: HDF (4.x and previous releases) and HDF5. These formats are completely different and NOT compatible. As of September 2005, there are no plans to drop support for HDF, but no new features will be added. New projects are encouraged to use HDF5.

Some of the limitations of HDF (4) are:
  • A single file cannot store more than 20,000 complex objects, and a single file cannot be larger than 2 gigabytes.
  • The data models are less consistent than they should be. There are more object types than necessary, and datatypes are too restricted.
  • The library source code is old and overly complex, does not support parallel I/O effectively, and is difficult to use in threaded applications.
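The two numeric limits can be checked before choosing a format for a new dataset. The sketch below makes two assumptions that are not stated in the text: that "2 gigabytes" means 2 × 1024³ bytes, and that the caller already knows the planned object count and total size. The function name `hdf4_limit_problems` is hypothetical.

```python
MAX_OBJECTS = 20_000            # "no more than 20,000 complex objects"
MAX_FILE_BYTES = 2 * 1024**3    # assumption: 2 gigabytes read as 2 GiB


def hdf4_limit_problems(n_objects: int, total_bytes: int) -> list:
    """Return a list of messages for each stated HDF4 limit the plan exceeds."""
    problems = []
    if n_objects > MAX_OBJECTS:
        problems.append(
            f"{n_objects} objects exceeds the {MAX_OBJECTS}-object limit"
        )
    if total_bytes > MAX_FILE_BYTES:
        problems.append(
            f"{total_bytes} bytes exceeds the {MAX_FILE_BYTES}-byte limit"
        )
    return problems
```

An empty list means the plan fits within both stated limits; a non-empty list is one signal that HDF5 may be the better choice, as the note above recommends for new projects.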

History

As announced in July 2005, the HDF group is spinning off from the National Center for Supercomputing Applications (NCSA) as a non-profit corporation. The new corporation, "The HDF Group" (THG), will continue to support open source software and non-proprietary data formats. At the time of the announcement, the move was expected to take up to six months to complete.

A new web site has been set up for THG at: http://www.hdfgroup.org/


Format specifications


Useful references

URLs


Last Updated: 02/20/2009