NIST Speech Group Website
Information Technology Lab, Information Access Division NIST: National Institute of Standards and Technology

  • Pilot Corpus - Phase 1

    The NIST Meeting Room Pilot Corpus consists of meeting data collected between 2001 and 2003. The pilot corpus includes 19 meetings of various forms and consists of approximately 15 hours of data recorded simultaneously from multiple microphones and video cameras.

    The Pilot Corpus was collected using a specially constructed real-time multimedia data collection facility. Several meeting forums and scenarios were employed to elicit a variety of meeting interaction types and vocabularies. In addition to digital multimedia recordings, the corpus includes a set of ancillary metadata and transcriptions. Collecting multimedia data from multiple participants in a complex hardware environment has been a learning experience.

    The paper "The NIST Meeting Room Pilot Corpus" (PDF version) was presented at LREC 2004.


    Page Created: September 19, 2007
    Last Updated: December 19, 2007

    Speech Group is part of IAD and ITL
    NIST is an agency of the U.S. Department of Commerce