The Library of Congress, American Folklife Center

Folk Heritage Collections in Crisis

Preservation of Audio

Elizabeth Cohen
Cohen Acoustical, Inc.

The goal of this paper is to be the strawman for galvanizing the preservation of recorded folk collections. So in that spirit, I throw down the gauntlet, or perhaps "kickstart" is a better term: the key to preservation is distribution. The challenges of preserving recorded folk collections have more to do with biting the bullet to get the digitization of the collections going than with inadequate technology. In other words, the challenges are mostly aesthetic and in the analog domain. The search for the perfect technical solution is frankly a diversion from the painstaking work and art of transfer. Furthermore, if anything, it is budgetary and acoustopolitical issues that are hampering our progress in doing what must be done: migrating the collections into the digital domain. A corollary to this is: migrate the collections into the digital domain with uncompromised fidelity.

Let me digress with an anecdote. A few months back, I spoke with a curator who was deeply concerned about the problems in preserving audio information that continue to paralyze us:

Preserving original materials

Obsolescence of playback machinery

Human readability

Defining what is a faithful copy

The discussion took the form of the litany, "Computers are unreliable, the risk of loss is too great, mediums change too quickly, the costs are too high, we can't even play back our (digital audio tapes, VHS, beta, 8 tracks...) from two years ago. We need our temperature- and humidity-controlled storage for original tapes, media with 100-200 year life expectancy, invincible encryption, unlitigatable no thrash copyright access." Feeling mischievous, I asked, "What controls the thermostats in your intelligent, climate-controlled conservation room?" My question was met with silence, and I had to suggest that in all likelihood it was a microprocessor: one of those pesky computers was leading the chain of command and was fundamentally responsible for preservation. To be practical, we must recognize that there is no escaping the role and rule of bits in the preservation of recorded folk collections. The first hurdle is to recognize the absolute integration of the computer and the computer network into twenty-first century life.

Although it may fall to some of us to deal with the "what ifs" of an electromagnetic pulse tragedy, our collections are far more likely to survive the scars of mayhem if they are robust and alive in many hands. Moreover, to delay the transfer of analog media into the digital domain is to compromise preservation. The more time that passes, the further the analog materials degrade.

Distribution is the key to preserving audio folklore collections in the twenty-first century. In fact, distribution is preservation. Moreover, it is the type of preservation that keeps the art alive rather than sterilized or enshrined behind glass in a passive museum setting. Fortunately, in the networked world, distribution is becoming both easier and cheaper. Our technical issues have shifted to studying the best methods of providing efficient access. Do we want multiple server nodes where folklife information is stored or a single location for a master server farm?[1] Will libraries become "storage service provider" utilities or will they lease space on new "electric company-like" utilities?

In the networked world, information can be maintained and distributed electronically; it no longer needs to be kept in a central location. Archives may no longer need to secure information in vaults. Collections may be located in a thousand places. Digitization forces a paradigm change. Librarians are used to thinking that when you make copies, it is not the real thing. The cult of the original was powerful in the world of analog recording, where information was lost with each generation. Today, however, the original material may be preserved in its pristine form anywhere and everywhere.
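
A rough way to see why a collection held in many places is more likely to survive is to treat each copy as failing independently. The sketch below is an illustration only: the independence assumption, the function name, and the 10 percent loss figure are hypothetical and are not drawn from any measured collection.

```python
def probability_all_copies_lost(per_copy_loss_probability, copies):
    """Chance that every copy is lost, assuming copies fail independently."""
    if not 0.0 <= per_copy_loss_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return per_copy_loss_probability ** copies


if __name__ == "__main__":
    # Hypothetical figure: each copy has a 10 percent chance of being lost.
    for copies in (1, 2, 5, 10):
        risk = probability_all_copies_lost(0.10, copies)
        print(f"{copies:2d} copies -> chance of total loss {risk:.10f}")
```

Copies that share a network, a format, or a funding source are not fully independent, so the real benefit is smaller than this idealized figure suggests.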

The good news is that none of the choices mentioned above are in the realm of rocket science; the bad news is that we must still contend with the warped strands of technophobia and politics. Sighing more than Al Gore in a debate, I find it painful to listen to the liberal archivists' search for the Holy Grail medium that will never decay and for which they will never have to maintain machines. The sighs turn into groans when I hear the conservatives launch into another paean to analog tape as the only medium we can trust.  These polar beliefs are evidence of an unwillingness to face the task of conversion into the digital domain.

There is no choice but to accept that data migration is the only intelligent policy. We know how to do this to exquisitely fine resolution; banks do it every day. Computer companies upgrade with every significant revision of code, and when the hard disk is full, or when cheaper and faster storage capacity is available, they do one thing: copy and transfer their data. Consumers do it: they have moved from 78s to LPs, from CDs to DVDs, and so on to streaming and downloading MP3s. The lifespan of consumer physical digital media is estimated to be five years or less (National Audio-Visual Conservation Center 2000). We do not know what the recording medium of choice will be in 10 years, not to mention 20, but we do know that it will facilitate the transmission of, access to, and storage of bits. Therefore, it is necessary to adopt a device-independent policy for the migration of digital audio data based on robust error correction capability.[2] The archival modality must have enough depth to render uncompromised audio quality. Today, for this stage of migration, we are assuming capture at 24 bits and 192 kilosamples per second.[3]
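
To give a sense of what capture at 24 bits and 192 kilosamples per second means for storage, the following sketch computes the size of uncompressed linear PCM audio. The function name is hypothetical, and the two-channel default and the comparison with 16-bit, 44.1 kHz audio are assumptions added for illustration; the paper does not specify a channel count.

```python
def pcm_bytes(duration_seconds, sample_rate_hz=192_000, bits_per_sample=24, channels=2):
    """Size in bytes of uncompressed linear PCM audio, ignoring file headers.

    Assumes bits_per_sample is a multiple of 8.
    """
    bytes_per_sample = bits_per_sample // 8
    return duration_seconds * sample_rate_hz * bytes_per_sample * channels


if __name__ == "__main__":
    hour = 3600
    archival = pcm_bytes(hour)  # 24-bit, 192 kHz, two channels (assumed)
    cd_quality = pcm_bytes(hour, sample_rate_hz=44_100, bits_per_sample=16)
    print(f"One hour at 24-bit/192 kHz:  {archival / 1e9:.2f} GB")
    print(f"One hour at 16-bit/44.1 kHz: {cd_quality / 1e9:.2f} GB")
    print(f"Ratio: {archival / cd_quality:.1f}x")
```

Under these assumptions one hour of two-channel audio occupies 4,147,200,000 bytes, a little over 4 gigabytes.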

Folklorists must remain vigilant and acquire the budgets for flawless transfers. All the original information must be retained. There is no scientific reason for loss of quality; only sloppiness or "value engineering" can intervene.

Our thesis is that all digital audio materials should be preserved through migration before decay exceeds the capacity of the built-in error correction. As long as one operates within the error correction envelope, it is possible to restore, copy, and preserve indefinitely the original material with no loss of information. Error correction also makes it feasible to detect degradation before there is any information loss. Both standard algorithms and flagging devices already exist to detect, flag, and correct errors before information is lost.

With this knowledge, it is possible to establish a policy for data migration of digital audio materials. This will enable curators to plan for the data migration necessary in the age of digital audio. It will also prevent the growing intractability of our audio archiving problems.
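
One simple way to confirm that a migration step has lost nothing is to compare a cryptographic checksum of the source with that of the copy. This is a minimal sketch using SHA-256, a technique the paper does not itself name; it complements, and does not replace, the error correction built into the storage medium. The function names are hypothetical.

```python
import hashlib
import shutil


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_and_verify(source_path, destination_path):
    """Copy a file and confirm that the copy is bit-identical to the source."""
    source_digest = sha256_of_file(source_path)
    shutil.copyfile(source_path, destination_path)
    copy_digest = sha256_of_file(destination_path)
    if source_digest != copy_digest:
        raise IOError(f"copy of {source_path} does not match the source")
    # The digest can be stored and rechecked at the next migration.
    return copy_digest
```

Keeping the returned digest alongside each file lets a later audit detect a damaged copy while intact copies still exist elsewhere.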

Why We Can't Afford to Dawdle

One hundred years of sound recording has left us with a legacy of the equivalent of 5 petabytes of professionally recorded audio. Libraries are already overwhelmed with preserving everything from cylinders to vinyl. They are drowning in a preservation crisis as they continue to accumulate media in extinct formats and as audio materials proliferate at a pace they are unable to keep up with.

But there is no mercy; J.A. Moorer of Sonic Solutions estimated that we are distributing terabytes (TB) of new garage band music every day (Moorer 2000). Three million new Web pages appear each day, and a growing percentage include streaming audio (Forbes Oct. 2, 2000 p. 148). Currently, 4,271 radio stations "broadcast" their signal on the net, up from 2,615 stations a year ago, and up from a mere 56 in 1996 (BRS Media Inc. 2000). One-quarter of the American population (57 million) has listened to Internet audio; 20 percent (45 million) listen to radio stations online and 13 percent (30 million) listen to Internet-only audio.[4] Information appliance companies are initiating music delivery to phones, personal digital assistants, and an array of portable entertainment devices. Lest you think that 64 kilobit audio is the sole character generator[5] that is stimulating the data storage industry, the surround sound community is creating its own info-rich recordings. With the standard sample rate shifting to 192/96 kHz, 24 bit, and 4.76 gigabytes of AV data per DVD, multichannel audio is swelling the data banks as well. As FedEx Chief Information Officer Robert Carter is quoted as saying, "There is this tidal wave of storage demand coming at us" (Lyons 2000: 146).

In the mid-nineties, I wrote about the likely appearance of unlimited and ubiquitous bandwidth in my arguments against adopting non-transparent compressed audio for new systems such as HDTV. Last month, EMC Senior Vice President James Rothnie was quoted in Forbes as saying that by 2005 the world's bandwidth could grow a million-fold, making it "virtually free and virtually infinite." Storage, he believes, will follow suit. He estimated that the total capacity sold annually could grow fifty-fold in five years, from 200 to 10,000 petabytes, enough to hold the text of 500,000 Libraries of Congress (Lyons 2000: 153).

Storage Media Choices for Customer Use, Interim Storage, and Preservation

The good news is that data storage is getting both cheaper and more space-efficient. According to Forbes magazine, disk density has nearly doubled every fifteen months for the past five years, while the cost per megabyte has fallen 52 percent every year during that same period (Goldman 2000: 152). Today's 3.5-inch drives are almost 600 times denser than the 14-inch mainframe drives of the 1980s. IBM's Ultrastar 72ZX holds 73 gigabytes, enough room for every original Frank Sinatra song ever recorded or all of Steven Spielberg's movies on DVD (Goldman 2000: 152). And we are rapidly approaching storage capacity of one terabit, or 125 gigabytes (1000/8), per square inch.

The cost per megabyte of storage capacity has decreased from about $30 in 1987 to $0.005 today. Even more remarkable is the decrease in the size of disk drives. Last summer, IBM released a 1-gigabyte Microdrive for $499.00. This Microdrive has the dimensions of a matchbook and weighs less than one ounce. According to IBM and as reported by Daniel Lyons in Forbes, its spinning platter, the size of a quarter, can hold the equivalent of 18 compact disks. IBM aims to double the storage capacity of this Microdrive every 12 to 18 months. To date, manufacturers of digital cameras, personal digital assistants, and 2 MP3 players have adopted it.
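
The growth figures quoted in this section can be cross-checked with a little compound-rate arithmetic. The sketch below is illustrative only; the helper name is hypothetical, and it assumes that "today" means the year 2000, the date of the sources cited.

```python
def annual_change(start_value, end_value, years):
    """Constant yearly multiplier that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years)


if __name__ == "__main__":
    # Cost per megabyte: about $30 in 1987 to $0.005 in 2000 (13 years, assumed).
    cost_factor = annual_change(30.0, 0.005, 2000 - 1987)
    print(f"Implied cost decline: {(1 - cost_factor) * 100:.0f} percent per year")

    # Density doubling every 15 months, compounded over five years (60 months).
    density_growth = 2 ** (60 / 15)
    print(f"Density growth over five years: {density_growth:.0f}x")

    # A 52 percent yearly price drop, compounded over five years.
    remaining = (1 - 0.52) ** 5
    print(f"Cost after five years: {remaining * 100:.1f} percent of the starting cost")
```

On these assumptions the thirteen-year cost figures imply a decline of a little under 50 percent a year, in the same range as the 52 percent annual drop reported for the most recent five years.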

Current Practice: Tape

Magnetic tape seems to be the interim, if not archival, system currently used for digital storage. Business systems include Exabyte Mammoth-2, Quantum DLT (Digital Linear Tape) 8000, Linear Tape Open, and Sony AIT-2. Tape technology is derived from two branches: helical and linear tape. The former is heir to higher density and performance, whereas the latter pledges greater reliability.

Many studios are using Exabyte tape drives for a wide range of audio archiving purposes including backup, data transfer, and preservation tasks. I have been told that Abbey Road has more than 2,500 Exabyte tapes. Individual musicians are using both the 8mm Exabyte tape and the Mammoth M2 225m tape cartridge formats. For dealing with interim exigencies, Exabyte tape offers the following advantages:

Error correction. Mammoth-2, for instance, uses a two-level Reed-Solomon error correction code (ECC). Exabyte's ECC corrects errors on the fly by rewriting the blocks within the same track.

Data-grade tape, such as AME, which stores more data per cartridge. Its anti-corrosive properties improve tape durability and reduce tape wear, allowing the media to achieve a 30-year archival rating.

Depending on the Exabyte system, reliability ranges between 250,000 and 500,000 hours. This is measured in mean time between failures (MTBF).  The higher the number of hours, the more reliable the drive.[6]
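
To put the MTBF figures in more familiar terms, they can be converted into an approximate chance of failure per year. This is a sketch under a common simplifying assumption, a constant failure rate (exponential model), with the drive powered continuously; neither assumption comes from Exabyte's documentation, and the function name is hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def annual_failure_probability(mtbf_hours, powered_hours_per_year=HOURS_PER_YEAR):
    """Chance of at least one failure in a year, assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-powered_hours_per_year / mtbf_hours)


if __name__ == "__main__":
    for mtbf in (250_000, 500_000):
        chance = annual_failure_probability(mtbf)
        print(f"MTBF {mtbf:,} hours -> about {chance * 100:.1f} percent per year")
```

On these assumptions, the quoted range corresponds to roughly a 2 to 3 percent chance that a given drive fails in a year of continuous use. MTBF describes the drive mechanism, not the longevity of the tape itself.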

Paul West of Universal Mastering Studios is currently using Sonic Solutions archiving software on Exabyte tape and then transferring the content to his mainframe system and onto DDP.

However useful Exabyte may be as a transfer medium, some users shudder when thinking of it as an archival medium. One user commented, "it seems you can sneeze and lose a file." Be aware that the copying time for Exabyte's drives is only two years. From a librarian's point of view, it is a device that is available from only one company, and that company is extremely vulnerable to the vicissitudes of the stock market. On the other hand, if distribution is preservation, then it is a transfer medium with a potential 30-year lifespan.

Sony Music, under the leadership of David Smith and Malcolm Davidson, has begun transferring Sony Music's assets into its digital audio archives using an automated tape library system. The archive system consists of a Sun Enterprise 450 server connected by SCSI to Sony DTF tape drives integrated into an ADIC AML/E automated media server.

The design of Sony's ADIC Automated Media Library is based upon the goal of "Infinite File Life," which allows "systematic monitoring and timely replacement of media, with secondary copies, or complete transfer to new technologies" (ADIC, Inc. 1999). Sony is able to evaluate automatically the quality of the backup tape before it deteriorates. Each cartridge is evaluated on a regular basis by looking at the raw error rates. If the raw error rates grow over time, then they can make an exact copy of the tape and delete the old one. With 600 TB of data on 200,000 cartridges, there was no choice but to automate the error correction (ADIC, Inc. 1999).
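
The monitoring approach described above can be sketched as a simple decision rule: copy a cartridge once its raw error rate reaches, or is about to reach, a safety threshold set well inside what the error correction can handle. This is not Sony's or ADIC's actual algorithm; the function name, the linear projection, the 50 percent safety fraction, and the sample error rates are all hypothetical.

```python
def needs_migration(raw_error_history, correctable_limit, safety_fraction=0.5):
    """Decide whether a cartridge should be copied to fresh media.

    raw_error_history: raw error rates from successive inspections, oldest first.
    correctable_limit: highest raw error rate the ECC can still fully correct.
    safety_fraction:   fraction of that limit at which migration is triggered.
    """
    if not raw_error_history:
        raise ValueError("at least one inspection result is required")
    threshold = safety_fraction * correctable_limit
    latest = raw_error_history[-1]
    if latest >= threshold:
        return True
    if len(raw_error_history) >= 2:
        # Project one inspection ahead using the most recent change.
        growth = latest - raw_error_history[-2]
        return latest + growth >= threshold
    return False


if __name__ == "__main__":
    limit = 1e-4  # hypothetical ECC limit
    print(needs_migration([1.0e-6, 1.1e-6], limit))  # stable cartridge
    print(needs_migration([1.0e-5, 4.0e-5], limit))  # rapidly degrading cartridge
```

Because the copy is made while every error is still correctable, the new cartridge holds exactly the original data.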

Current Practice: MO Disks

Magneto-optical (MO) disks may play a role in systems of audio preservation and distribution. They are less expensive than hard disk drives and can provide between 20 and 40 years of viable storage. Future blue laser[7] MO will quadruple storage capacity.

The Audio Engineering Society has just released its standard for audio preservation and restoration, "Method for estimating life expectancy of magneto-optical (M-O) disks, based on effects of temperature and humidity." To develop this standard, a sample of 80 disks was baseline tested for byte error rate (BER). The standard gives a graph that can be used to estimate the time for a given percentage of disks to fail.

Preservation Strategies

The development of successful preservation strategies will require the cooperation of computer scientists, data storage experts, data distribution experts, fieldworkers, librarians, and folklorists. There needs to be a transfer of technology from the information storage and transportation businesses into the folklife domain. Banking, security, and critical services industries all have dealt with the issue of preserving vital information. We must draw on their experience in developing policies of backup and redundancy, and in addressing human interface issues.

The Research and Development Agenda

We need to work with research and development efforts across a variety of disciplines. For instance, exciting work is being done in haptic simulation, which some day will allow us to virtually touch and work with virtual objects. We will be able to receive tactile feedback in playing virtual machines or musical instruments.

In conclusion, we have examples from other industries on how to archive. There are no technical barriers to archiving. The technical aspects of this "problem" have been solved. Capitalism is providing cheaper, faster, and more reliable modes of storing, accessing, and distributing audio. A social decision must be made to migrate materials into the digital domain, or it will undoubtedly be done without the aesthetic guidance of the folklife community, as the record companies have discovered while they waited for the "ultimate" technology. The genie is already out of the bottle. If you want a voice, it's time to do the work, not just talk.


Notes

[1] A server farm is a collection of many computers dedicated as servers.

[2] Error correction is a well-developed technology that enables detection of signal degradation and enables the user to take action before vital information is lost.

[3] Some may think that 192 kilosamples is overkill but there is a cohort of researchers and musicians who believe that inaudible harmonics may affect brain function. Although I question whether the research is very credible, I believe we would be "better safe than sorry."  Four times oversampling is a spit in the ocean of bandwidth.

[4] http://www.arbitron.com/downloads/broadband.pdf [Arbitron/Coleman. 2000. The Broadband Revolution: How Superfast Internet Access Changes Media Habits in American Households. Available from http://www.colemanresearch.com/broadband.pdf?]

[5] Character generator is a generator of bits, often alphanumeric characters.

[6] http://www.exabyte.com/home/newsinfo.html

[7] Blue lasers have shorter wavelengths, so they can be focused to tighter spots and can therefore store more information on optical media.


References

ADIC, Inc. 1999. Preserving Musical History: Digital Asset Management for Intellectual Property at Sony Music, Inc. Available from http://www.thic.org/pdf/Jul97/emass.pkoliopoulos.pdf

BRS Media, Inc. 2000. BRS Media's Web-Radio report strongest growth segment. Press release, Sept. 20, 2000. Available from http://www.brsmedia.com/press000920.html.

Goldman, Lea. 2000. "Driving Hard, Driving Fast." Forbes (Oct. 2).

Lyons, Daniel. 2000. "Boom!" Forbes (Oct. 2).

Moorer, J.A. 2000. Personal communication with author, September 28, 2000.

National Audio-Visual Conservation Center. 2000. Planning Culpeper's Digital Archives. An Overview of Digital Planning and the Digital Prototyping Project. Available from  http://www.loc.gov/rr/mopic/avprot/avldocs.html

  April 27, 2005