NIST Speech Group Website
Information Technology Lab, Information Access Division NIST: National Institute of Standards and Technology


    Data Collection Facility

    NIST has constructed a Meeting Data Collection Laboratory to collect corpora that support research, development, and evaluation in meeting recognition technologies. The room is equipped to look and sound like a conventional meeting space: sensors have been placed inconspicuously, and noisy processors are located outside the room. With a few exceptions, the room uses commercially available microphones and video cameras. What makes the facility unique is that all of the sensor streams are digitally sampled in real time and time-synchronized to a local, central clock using the Network Time Protocol (NTP) time signal. In theory, this permits the data collection system to be plugged directly into real-time recognition systems. We believe this approach provides a good prototype of the front-end sensor systems that might be found in future Smart Meeting Rooms. This facility was used to collect the NIST Meeting Pilot corpus.

    Data Streaming and Synchronization

    The Meeting Data Collection Laboratory contains a variety of microphones and several video cameras. Their data streams are all transferred over the network using the NIST Smart Data Flow architecture developed by the NIST SmartSpace laboratory. The streams are synchronized after recording using two mechanisms: a file format that saves a time stamp for each sample recorded, and the Network Time Protocol (NTP) standard, with time derived from the NIST atomic clock signal and a local computer acting as a 1-hop NTP server for all capture nodes. This capability permits experiments that compare automatic recognition/extraction performance using different sensors or multiple sensors in tandem.
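    Because every sample carries a time stamp from a common clock, streams from different sensors can be lined up after recording. The following is a minimal sketch of that idea, not the NIST Smart Data Flow implementation: it assumes each stream is a time-sorted list of (time in seconds, value) records, and the names nearest_index and align are hypothetical.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Return the index of the entry in sorted `timestamps` closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer to t.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align(reference, other, tolerance):
    """Pair each (time, value) record in `reference` with the nearest-in-time
    record of `other`, skipping pairs more than `tolerance` seconds apart."""
    if not other:
        return []
    other_times = [t for t, _ in other]
    pairs = []
    for t, value in reference:
        j = nearest_index(other_times, t)
        if abs(other_times[j] - t) <= tolerance:
            pairs.append((t, value, other[j][1]))
    return pairs


# Illustrative records only (assumed layout, not the NIST file format).
audio = [(0.00, "a0"), (0.01, "a1"), (0.02, "a2")]
video = [(0.001, "v0"), (0.034, "v1")]
print(align(video, audio, tolerance=0.005))
```

    The tolerance guards against pairing records when one stream has a gap; a suitable value depends on the sample rates involved and on how tightly the capture nodes track the NTP server.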

    Room Layout

    The room has been configured as a conventional meeting space, and the sensors have been placed to be as unobtrusive as possible. The room is approximately 22 x 22 meters and can be configured for a variety of meeting forums (conference table, round table, classroom). However, all of the meetings in the pilot corpus were collected using the conference table configuration shown in the diagram.

    Video Capture

    The data collection facility includes 5 Sony EVI-D30 video cameras. Four have stationary views of the center conference table (1 view from each surrounding wall) with a fixed focus and viewing angle. An additional "floating" camera is used to focus on particular participants, the whiteboard, or the conference table, depending on the meeting forum. The video data is captured in a NIST-internal file format. See the Source Data page for more details on the video format.

    The facility also employs Camtasia Studio for screen capture when the room's PC/projector is used. This capability was added late in the data collection cycle, so only a few of the pilot corpus meetings include this data. Camtasia Studio uses its own proprietary video format. The captured data can, however, be converted to Motion-JPEG A and then to MPEG-2 for distribution.

    Audio Capture

    The data collection facility includes several types of microphones, ranging from wireless close-talking head microphones to experimental microphone arrays. All commercial microphones are recorded through a 24-channel A/D converter.

    Commercial Mics

    Each meeting participant is equipped with 2 worn "personal" microphones (1 noise-cancelling headset mic and 1 directional lapel mic). Both personal mics are wireless, so participants are free to move about the room. The A/D converter described above can support up to 8 sets of personal mics. In addition to the personal microphones, the table, when set up in conference configuration, is equipped with 4 table-mounted microphones: 3 omni-directional boundary mics positioned at the center and ends of the table, and a 4-channel directional boundary microphone that rests at the center of the table. A speaker phone system with audio tap is also available.
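    The microphone counts above can be checked against the 24-channel A/D with a short calculation. This is only a sketch: it assumes each microphone (or each channel of the 4-channel directional mic) occupies exactly one A/D input, and the constant and function names are hypothetical.

```python
ADC_CHANNELS = 24                   # capacity of the A/D
MAX_PERSONAL_MIC_SETS = 8           # sets of personal mics supported
MICS_PER_SET = 2                    # headset mic + lapel mic
OMNI_BOUNDARY_MICS = 3              # table centre and both ends
DIRECTIONAL_BOUNDARY_CHANNELS = 4   # one 4-channel directional boundary mic


def channels_in_use(participants):
    """A/D inputs used in the conference configuration for the given
    number of participants (assumes one input per mic channel)."""
    if not 0 <= participants <= MAX_PERSONAL_MIC_SETS:
        raise ValueError("participants must be between 0 and 8")
    return (participants * MICS_PER_SET
            + OMNI_BOUNDARY_MICS
            + DIRECTIONAL_BOUNDARY_CHANNELS)


used = channels_in_use(8)
print(used, "of", ADC_CHANNELS, "channels used;", ADC_CHANNELS - used, "spare")
```

    Under these assumptions a full meeting of 8 participants uses 23 of the 24 inputs. Whether the speaker phone audio tap occupies the remaining input is not stated in the text above.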

    All commercial mics were collected at a 48 kHz sample rate with a 24-bit sample size. The data were SPHERE-encoded and downsampled to 16 kHz/16-bit for distribution.
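    Going from 48 kHz/24-bit to 16 kHz/16-bit means keeping one sample in three and reducing each sample by 8 bits. The text above does not say which resampling method was used, so the following is a generic decimation sketch rather than the NIST procedure: a Hamming-windowed sinc low-pass filter followed by 3:1 decimation and rescaling. The function names, the 63-tap filter length, and the 7 kHz cutoff are all assumptions chosen for illustration.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc FIR taps (num_taps must be odd).
    `cutoff` is a fraction of the input sample rate (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def downsample_48k24_to_16k16(samples, num_taps=63):
    """Convert signed 24-bit integer samples at 48 kHz to signed 16-bit
    integer samples at 16 kHz."""
    taps = lowpass_taps(num_taps, cutoff=7000 / 48000)  # below the new 8 kHz Nyquist
    half = num_taps // 2
    out = []
    for centre in range(0, len(samples), 3):  # keep every third filtered sample
        acc = 0.0
        for k, tap in enumerate(taps):
            i = centre + k - half
            if 0 <= i < len(samples):         # treat out-of-range input as silence
                acc += tap * samples[i]
        value = int(round(acc / 256))         # 24-bit scale -> 16-bit scale
        out.append(max(-32768, min(32767, value)))
    return out
```

    The low-pass step matters because content above 8 kHz would otherwise alias into the 16 kHz output. A production pipeline would more likely use an established resampling library than a hand-written filter.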

    Linear Array Mics

    Three experimental Mark-II 59-channel linear array microphones developed by the NIST Smart Spaces Lab are positioned on pole mounts against the front and side walls of the room. Each mic array channel was collected at 22,050 Hz with a 16-bit sample size. Unfortunately, because of technical problems in data collection, this data was not distributed.
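    To give a sense of scale, the raw data rates implied by the figures above can be worked out directly. The sketch below assumes uncompressed PCM with no container or time-stamp overhead and, for the commercial mics, that all 24 A/D channels are active; the function name is hypothetical.

```python
def raw_bytes_per_second(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed PCM data rate in bytes per second."""
    return channels * sample_rate_hz * bits_per_sample // 8


# Three 59-channel arrays at 22,050 Hz, 16-bit.
array_rate = raw_bytes_per_second(3 * 59, 22050, 16)

# Commercial mics as captured: 24 channels at 48 kHz, 24-bit (assumes all channels active).
commercial_rate = raw_bytes_per_second(24, 48000, 24)

print(f"arrays:     {array_rate / 1e6:.2f} MB/s")
print(f"commercial: {commercial_rate / 1e6:.2f} MB/s")
```

    Under these assumptions the three arrays together produce roughly 7.8 MB/s, a little more than twice the rate of the 24-channel commercial microphone capture.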

    Room layout

    [Figure: room layout diagram]

    All pilot corpus meetings were collected using this configuration. Note: the electronic whiteboard was introduced halfway through the collection of the pilot corpus.


    Page Created: September 19, 2007
    Last Updated: February 8, 2008

    Speech Group is part of IAD and ITL
    NIST is an agency of the U.S. Department of Commerce