Digital Preservation and Copyright
If all information in the world were
written on clay tablets or
carved into marble, its preservation
would be greatly simplified.
Even paper, when manufactured and
stored properly, can have a life
measured in hundreds of years.
Today, however, much of the
information being produced is
digital,[1] and
digital formats are notoriously fragile. Either
the media on which
the information is stored becomes unreadable, or
the hardware and
software needed to read the work becomes obsolete.
Think of that
old 8" floppy disk in the back of the drawer with your
attempt from
twenty years ago to write the Great American Novel (in
WordStar).
The magnetic data might no longer be readable; drives that
can read
the disk are scarce; and few word processing packages today
can
understand WordStar documents.
To preserve analog
information resources, it is often sufficient to
house them in a
benign environment. In particularly bad cases, it
might be
necessary to make a microfilm or xerographic copy of the
original,
but copying is the exception rather than the rule. Digital
preservation, however, starts with copying. At a minimum, files
need
to be copied from obsolete or decaying media, such as 8"
floppy disks
or 5¼" floppies, to current storage media. Good
preservation practice
requires much more, including making multiple
copies of files.
Digital documents may need to be changed from
WordStar to WordPerfect
to Word format, or perhaps even converted
to PDF or XML format. Every
time you use a digital file, you must
copy it. When digital documents
are displayed in a computer, they
are copied from the storage medium
into the RAM of the
computer, where they are then displayed.
Digital preservation and
access is all about copying.
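The minimum practice described above, copying a file off old media and keeping more than one copy, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the file name are hypothetical, and the checksum comparison is an added safeguard that the article itself does not prescribe.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_preservation_copies(source: Path, target_dirs: list[Path]) -> str:
    """Copy `source` into each target directory and verify every copy.

    Returns the digest of the source so it can be recorded with the copies.
    (Hypothetical helper; the checksum step is an assumption, not a rule
    taken from the article.)
    """
    original_digest = sha256_of(source)
    for target_dir in target_dirs:
        target_dir.mkdir(parents=True, exist_ok=True)
        copy_path = target_dir / source.name
        shutil.copy2(source, copy_path)  # copy2 also keeps timestamps where possible
        if sha256_of(copy_path) != original_digest:
            raise IOError(f"Copy at {copy_path} does not match the original")
    return original_digest
```

A call such as `make_preservation_copies(Path("novel.ws"), [Path("disk_a"), Path("disk_b")])` would leave two verified copies. Each of those copies is a "reproduction" in the sense discussed next.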
In copyright law, copying is
known as "reproduction," and it's one
of the exclusive rights of
the copyright owner.[2] The right to publicly display a work
is also an
exclusive right of the copyright owner,[3] as is the right to make an
adaptation, known as a "derivative work."[4] Our desire to keep digital
information around for the future runs smack into the exclusive
rights
of the copyright owner.
Fortunately, while there
is no general exemption for preservation
activities in copyright
law, there are exemptions that can help
individuals and especially
libraries and archives legally preserve
expressive works for the
future. There are some specific exemptions
for certain types of
actions and for certain actors. Furthermore, in
the absence of a
specific exemption, one can always consider fair use
as a defense
when making a preservation copy.
Is it copyrighted?
Do you own the copyright?
Even before you start
looking for exemptions in the copyright law,
it is always a good
idea to check first to determine if an item really
is copyrighted.
Since there have not been any registration or notice
requirements
for copyright protection since 1989, most digital
information is
copyrighted as soon as it is created. But there are
exceptions.
Works created by the federal government are in the public
domain,[5] as are
facts,[6] ideas,[7] and expired works.
Digital copies of public domain works may themselves be in the
public
domain.[8]
You also don't need to worry about legal
restrictions on
preservation if you own the copyright in the work.
The copyright in
your draft of the Great American Novel most likely
belongs to you, and
you can do with it what you want. The same
goes for the digital
photographs you took on vacation last summer.
Let's assume, though, that what you are interested in
preserving is
copyrighted and that you do not own the copyright in
the work. What
then? There are at least three specific sections of
the copyright law
that may be of assistance.
Archiving Computer Programs: 17 USC § 117
If the digital file you are interested in saving is a computer
program, 17 USC § 117 of the United States copyright law can
help. This section states that in spite of the copyright owner's
exclusive rights, it is permissible for you to make a copy for
archival purposes of a copyrighted computer program. A computer
program is defined in the law as "a set of statements or
instructions
to be used directly or indirectly in a computer in
order to bring
about a certain result."[9] The law allows you to make a copy
of the WordStar
program (if you legally own it), and even adapt it
to run on your
Windows XP or Linux machine (if you can), but not
share the file with
anyone else. The section only applies to the
computer program itself.
It does not authorize the reproduction or
adaptation of documents
created with WordStar when the copyright in
those documents is owned
by someone other than you.
Libraries and Archives Making Preservation Copies: 17 USC § 108
Libraries and archives have additional preservation options under
17 USC § 108 of United States copyright law. One of the few
good things included in the Digital
Millennium Copyright Act ("DMCA") was a provision
that explicitly
allows libraries and archives to make up to three
copies of a work for
preservation purposes. Unlike the rest of the
provisions of Section
108, the items being preserved can be in any
format (text, images,
sound, etc.). Furthermore, the copies can be
digital, so long as they
are not distributed digitally nor made
available to the public in a
digital format outside the premises of
the library or archives.
In order to take advantage of
the exception, libraries and archives
must follow certain ground
rules. They must be either open to the
public or allow access to
non-affiliated researchers; the copying
cannot be for "direct or
indirect commercial advantage"; the library
or archives must own a
legal copy of the original item; and any copies
made must carry
with them a notice of copyright.[10] If the work is unpublished,
preservation copies
can be made for the purpose of preservation or
security.[11] If the
work is
published, preservation copies can be made to replace an
original
that is "damaged, deteriorating, lost, or stolen, or if the
existing format in which the work is stored has become obsolete."
The
law stipulates that a format is obsolete "if the machine or
device
necessary to render perceptible a work stored in that format
is no
longer manufactured or is no longer reasonably available in
the
commercial marketplace." The library or archives must also
conduct a
reasonable investigation to confirm that an unused copy
cannot be
obtained at a fair price. If digital copies are made,
access to the
digital version must be limited to the premises of
the library or
archives.[12]
Using Section 108, libraries and
archives can start preserving old
digital files in their
collections. It does not help them, however,
preserve materials
that they do not own, such as networked resources
or Web sites.
Nor does Section 108 help individuals who want to
preserve
digital files they may have legally acquired or obtained
from the
Internet. For this sort of preservation, we must rely on fair
use.
Fair Use Preservation by Individuals and Libraries: 17 USC § 107
Since individuals cannot use Section 108 to make copies, even for
preservation purposes, they must turn to the Fair Use provision in
US
copyright law. Mary Minow provides a highly readable overview
of
fair use in "How I learned to love FAIR USE..."[13] At the heart of
the
fair use exemption is the assessment of the four factors that
constitute fair use: Purpose of the use, Nature of the
work, Amount or substantiality used, and Market impact
(PNAM). What might a fair use argument for digital
preservation look like?
It is likely that most
preservation copying would meet Minow's PNAM
test. As Robert
Oakley has noted about preservation copying in
general:
Virtually everyone views
preservation copying as socially beneficial. It is consistent with
the Constitutional purposes for copyright since the preservation of
printed knowledge is necessary for the progress of science and the
useful arts.[14]
If preservation is being done for
non-commercial, socially
beneficial reasons, it seems likely that
the "Purpose" factor
would lean towards fair use.
The nature of digital works, the second fair use factor, can vary
greatly, but Congress seems open to preserving a wide variety of
material when preservation is at stake.[15] The "Nature" factor,
then,
might also support a fair use.
The third factor,
the Amount and substantiality copied,
might normally weigh
against a finding of fair use, since the item is
being copied in
its entirety. But the Supreme Court has noted, "the
extent of
permissible copying varies with the purpose and character of
the
use."[16]
Obviously, if the purpose is to preserve a work, then the entire
work
must be copied. The amount copied is appropriate for the
purpose, and
so a court might even find this use fair.
The fourth factor, the Market impact of making a
preservation copy, is likely to be the most important in any fair
use
assessment, and unfortunately it is almost impossible to guess
how a
court might rule on this. Would the courts conclude that
digital
information is like the computer programs protected by 17
USC §
117, which can be migrated and adapted to run on new
platforms without
compensation to the copyright owner? Or would
the courts conclude
that purchasing a copy of a work does not give
you the right to copy
it onto new media or transform it into new
formats into perpetuity?
Would they decide that individuals, like
libraries copying under 17
USC § 108, must first determine if
an unused copy can be
purchased before a preservation copy can be
made? Unfortunately, there
have been no cases involving digital
preservation that can serve as
indicators of how the courts might
rule.
As is always the case with fair use, you can't
really know if your
use is fair until a court determines if it is
fair. Nevertheless, when
considering the preservation problems of
motion pictures, the Senate
concluded that given the great danger
of loss, "making of duplicate
copies for purposes of archival
preservation certainly falls within
the scope of 'fair use.' "[17] We can hope that
the courts might accept a similar
argument for equally fragile
digital information.
Preserving the World Wide Web: A Specific Case
As the World Wide Web has become an
ever-more important information
resource, there has been growing
interest in preserving parts of it.
Subjects of Web site
preservation projects include flooding in the
Red River Valley,[18]
the national
election in 2000,[19] the response to the events of 11 September 2001,[20] and the web pages
of
the Clinton White House.[21] The Internet Archive has sought to capture and
preserve a sizeable portion of the entire World Wide Web.[22] Other projects
have
sought to preserve the national Web of Sweden,[23] Denmark,[24] and the Nordic
Countries.[25] On
a more local
scale, many universities and other organizations are
beginning to
wonder how they might capture and preserve Web pages
associated with,
but not necessarily owned by, them.
Most information found on the Web is automatically copyrighted
when
created. The groups that want to preserve Web pages, however,
are
often not the copyright owners of those Web pages. We can
presume
that the copyright owner has granted an implied license to
allow
people to copy a Web page to a local machine and display it
there;
after all, if they did not want people to be able to read a
page
(which in the Web environment means making a temporary copy on
your
local machine), they would not have put the document up on the
Web.
But is there implied permission to copy and preserve Web
pages whose
copyright you do not own? If not, can such actions
qualify as a fair
use?
The most ambitious attempt to
preserve the Web, the Internet
Archive and its "Wayback Machine,"
allows you to retrieve outdated Web
pages from multiple points in
time. The Internet Archive has attempted
to bolster a possible fair
use defense in a number of ways. First, it
allows Web page
producers to "opt out" of the archives. It
does this by
offering instruction on how to use a "robots.txt" file to
prevent
its crawlers from retrieving new pages. Presence of a
"robots.txt"
file will also prevent Wayback Machine users from
accessing
previously harvested pages. In addition, under certain
conditions,
the Internet Archive will remove material from its
holdings.[26]
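To make the "robots.txt" opt-out concrete, here is a minimal sketch of how a crawler could check a site's exclusion rules before harvesting a page, using Python's standard `urllib.robotparser` module. The helper name `may_archive`, the user-agent string `example-archiver`, and the sample rules are all hypothetical; this is not the Internet Archive's actual code, and it covers only the crawl-time check, not the blocking of previously harvested pages.

```python
from urllib import robotparser


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits `user_agent` to fetch `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules a site owner might publish to keep one directory
# away from every crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical rules that exclude a single (made-up) archiving crawler
# from the entire site while leaving other crawlers unaffected.
OPT_OUT_ROBOTS_TXT = """\
User-agent: example-archiver
Disallow: /
"""
```

With the first set of rules, a page under `/private/` would be skipped while the rest of the site could be harvested; with the second, the hypothetical `example-archiver` crawler would skip the whole site.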
The
Internet Archive's willingness to respect the wishes of those
copyright owners who want to limit and control the reproduction of
their copyrighted works reduces the Archive's risk of infringement
suits. At the same time, it diminishes the utility of the archive
as
a whole by excluding important parts of the Web. For example,
eBay is
one of the most successful commerce initiatives in Internet
history.
The use policy for eBay stipulates that users must "not
use any robot,
spider, scraper or other automated means to access
the Site for any
purpose without our express written permission."
[27] Any Web
archiving
initiative that respects the terms of eBay's policy will
not capture
the eBay site, diminishing the value of the archive as
documentation
of the history of the Internet.
It is
unclear how much protection the Internet Archive's policies
really
provide. As one recent analysis concluded:
In short, the Internet
Archive largely ignores copyright law in the process of collecting
its
material, provides only a limited (and, arguably, effectively
valueless) protection for the material once stored, and in effect
disclaims any responsibility for what is done with the material by
the
end user, as well as any liability that the end user may incur
in
accessing the material. Given the litigious nature of the US,
it will
be interesting to see if the Internet Archive's success in
avoiding
litigation over its activities will continue for much
longer.[28]
Licensing and the DMCA: The Final Hurdles
Even
if you determine that 17 USC §§ 117, 108, or 107 can be
used
to legally preserve a digital resource, you may still not be out
of
the woods. Licensing contracts, for example, can override these
sections. You should carefully read any licensing agreements you
or
your institution has signed before making preservation copies
of
licensed resources.
Further, under the DMCA, if the
digital resource is protected by a
technology that controls access
to the resource, you cannot legally
bypass the access control
mechanism, even to preserve it.[29] On October 28,
2003, the Librarian of Congress
issued a rulemaking that rejected the
idea of a general
preservation exception to the terms of the DMCA.[30] Two of the four
exceptions
found in the rulemaking might be of some assistance in
preservation. For the next three years, users may legally bypass
access control mechanisms that rely on dongles that have
malfunctioned. They may also bypass access control mechanisms in
computer programs and video games distributed in formats that have
become obsolete and which require the original media or hardware as a
condition of access.
Conclusion
Good
preservation practice has often existed in a legal gray area.
Libraries usually made three copies when microfilming long before
the
law gave explicit permission for the practice, and many radio
programs
have been saved only because individuals systematically
taped them
from the air, without the permission of the copyright
owner.[31]
Digital
preservation resides in an even murkier legal gray area
because of the
fundamental need to copy digital information (one of
the exclusive
rights of the copyright owner) in order to preserve
it. In addition,
there is greater interest in preserving works
that you may not own,
particularly web pages. The lack of legal
certainty, however, should
not prevent individuals and libraries
from undertaking the socially
beneficial task of preserving digital
information. The law explicitly
authorizes some preservation
actions (especially if the materials are
not made digitally
available to others), and a strong fair use defense
can be built
outside the library or archives.
More Resources
In
addition to the items listed in the notes below, the following
publications provide useful information on the legal issues
associated
with preservation, particularly the preservation of
Internet
resources:
June
Besek, "Copyright
issues relevant to the creation of a digital
archive: a preliminary
assessment" (Washington, D.C.: Council on
Library and Information
Resources and the Library of Congress,
2003) <http://www.clir.org/pubs/reports/pub112/contents.html>
Adrienne Muir,
"Copyright
and licensing for digital preservation" Library and
Information
Update, June 2003 <http://www.cilip.org.uk/update/issues/jun03/article2june.html>
Dwayne
Buttler and
Kenneth Crews, "Copyright Protection and Technological
Reform of
Library Services" in Tomas Lipinski, ed., Libraries,
Museums,
and Archives: Legal Issues and Ethical Challenges in the
New
Information Era (Lanham, Maryland: Scarecrow Press, 2002)
Acknowledgements
Mary Minow and Nancy McGovern
provided invaluable advice during the
preparation of this
article.
[4] 17 U.S.C. §
106(2). It could be argued that converting a document from
WordStar to Word makes a derivative of the original work.
[14] Robert Oakley,
Copyright and Preservation: A
Serious Problem in Need of a Thoughtful
Solution (Washington,
D.C.: Council on Library and Information
Resources, September 1990)
<http://www.clir.org/pubs/abstract/pub11.html>.
[15] For example,
17 USC § 108
exemptions do not apply to "a musical work, a
pictorial, graphic or
sculptural work, or a motion picture or other
audiovisual work
other than an audiovisual work dealing with news"
unless the
material is being preserved, at which point the
nature and format
of the material ceases to matter.
[17] House Report
on the new copyright law, H.R. Rep.
No. 94-1476, quoted in U.S.
Copyright Office, Reproductions of
Copyrighted Works by Educators and
Librarians (Washington, D.C.,
1995): p. 10 <http://www.copyright.gov/circs/circ21.pdf>
[18] Mark A.
Greene,
"Floods, Flashsite, and Fond Delusions: Archiving the Web,"
Spectra, the Journal of the Museum Computer Network 25:3 (Spring
1998), 10-14.
[30] Rulemaking on
Exemptions from
Prohibition on Circumvention of Technological Measures
that Control
Access to Copyrighted Works, 28 October 2003 <http://www.copyright.gov/1201/>
[31] Erik Smith,
"Around the Dial Web Battle Is Latest
Episode in Old-Time Radio
Serials One firm is carrying on the fight
to enforce copyrights, much
to the dismay of collectors and
Internet users." Los Angeles Times,
2/16/2001.
Peter
B. Hirtle is Director for Instruction and Learning in the
Instruction,
Research, and Information Services Division of Cornell
University
Library. Hirtle also serves as the Intellectual
Property Officer for
the Cornell University Library. Previously
while at Cornell, Hirtle
served as Director of the Cornell
Institute for Digital Collections
where he explored the use of
emerging technologies to expand access to
cultural and scientific
sources through the development and management
of distinctive
digital collections. He also served as the Associate
Editor of
D-Lib Magazine, a monthly magazine
about
innovation and research in digital libraries.
This work is licensed under a
Creative
Commons License.