NIST Scientific and Technical Databases

NIST Special Database 6

NIST Structured Forms Reference Set of Binary Images II (SFRS2)

Price: $90.00

The second NIST database of structured forms consists of 5,595 pages of binary (black-and-white) images of synthesized documents containing hand-printed data.

The documents in this database are 12 different tax forms from the IRS 1040 Package X for the year 1988. These include Forms 1040, 2106, 2441, 4562, and 6251 together with Schedules A, B, C, D, E, F, and SE. Eight of these forms contain two pages, or form faces; therefore, there are 20 different form faces represented in the database.

The document images in this database appear to be real hand-printed forms prepared by individuals, but the images have been automatically derived and synthesized using a computer and contain no "real" tax data. There are 900 simulated tax submissions represented in the database averaging 6.22 form faces per submission.
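The counts quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures stated in this description; the variable names are illustrative and are not part of the database or its example software.

```python
# Form identifiers listed in the description (IRS 1040 Package X, 1988).
forms = ["1040", "2106", "2441", "4562", "6251"]
schedules = ["A", "B", "C", "D", "E", "F", "SE"]

total_forms = len(forms) + len(schedules)

# The description states that eight forms have two pages (form faces),
# so each of those contributes one additional face.
two_page_forms = 8
form_faces = total_forms + two_page_forms

# Average number of form faces per simulated tax submission.
images = 5595
submissions = 900
avg_faces = round(images / submissions, 2)

print(total_forms, form_faces, avg_faces)
```

Under these figures the sketch reproduces the 12 forms, 20 form faces, and 6.22 form faces per submission given in the text.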

This database totals approximately 5.95 gigabytes of uncompressed image data including image format documentation and example software.

The database has the following features:

  • 900 simulated tax submissions
  • 5,595 images of completed structured form faces containing hand-printed data
  • 12 pixels per millimeter resolution
  • 5,595 text files containing entry field answers
  • 20 tables of entry field types and contexts
  • image format documentation and example software

Suitable for both document processing and automated data capture research, development and evaluation, the database can be used for:

  • forms identification
  • field isolation: locating entry fields on the form
  • character segmentation: separating entry field values into characters
  • character recognition: identifying specific hand-printed characters

The database is a valuable tool for measurement of system performance and system comparison on complex forms.
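As an illustration of how a system might be measured against the entry field answers, the sketch below computes a field-level accuracy and a character error rate. It is not NIST's scoring method, and it does not read the database's answer files, whose format is not described on this page. The field names and values are hypothetical, and the answers are assumed to have already been loaded into plain dictionaries.

```python
def field_accuracy(reference, hypothesis):
    """Fraction of reference fields whose recognized value matches exactly."""
    if not reference:
        return 0.0
    correct = sum(1 for key, value in reference.items()
                  if hypothesis.get(key) == value)
    return correct / len(reference)


def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Total edit distance over all fields divided by total reference characters."""
    errors = sum(edit_distance(value, hypothesis.get(key, ""))
                 for key, value in reference.items())
    total = sum(len(value) for value in reference.values())
    return errors / total if total else 0.0


# Hypothetical field names and values, for illustration only.
reference = {"name": "JOHN DOE", "wages": "25000", "zip": "20899"}
hypothesis = {"name": "JOHN D0E", "wages": "25000", "zip": "2089"}

print(field_accuracy(reference, hypothesis))
print(character_error_rate(reference, hypothesis))
```

Field accuracy is strict, since one wrong character fails the whole field, while the character error rate gives partial credit. Reporting both makes it easier to compare systems on forms with long entry fields.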

The Users' Guide, which is also available as a PDF, describes how this database works.

System Requirements: CD-ROM drive with software to read ISO-9660 format.

Special pricing for multiple copies available. Call for details.

For more information on Special Database 6 please contact:

Standard Reference Data Program
National Institute of Standards and Technology
100 Bureau Dr., Stop 2300
Gaithersburg, MD 20899-2310

(301) 975-2008 (VOICE) / (301) 926-0416 (FAX)

The scientific contact for this database is:

Michael Garris
National Institute of Standards and Technology
100 Bureau Drive, Stop 8940
Building 225, Room A216
Gaithersburg, MD 20899-8940
(301) 975-2928
michael.garris@nist.gov

Keywords: ASCII Reference; Binary Image Database; character recognition; hand print; hand printed characters; NIST; OCR; optical character recognition; software recognition; synthesized documents; tax forms.

Create Date: 6/02
Last Update: Wednesday, 21-May-08 11:26:38