I. EXECUTIVE SUMMARY
In the 1991 Feist Publications v. Rural
Telephone Service Corp. case, the U.S. Supreme Court ended
the "sweat of the brow" doctrine that had conferred
some degree of copyright protection on non-creative compilations
of information. The Feist decision has produced subsequent
case law in which databases resulting from a substantial investment
have been taken by others to produce competing products; lack
of copyright leaves the database maker with no recourse against
third party predators -- except what state misappropriation law
might offer -- and only limited recourse, based on contract law,
against contracting parties.
Beginning in the late 1980's, Member States
of the European Union (EU) sought to harmonize the copyright laws
of their various legal systems. That effort resulted in an awareness
that some EU States -- Ireland, the U.K., the Netherlands, and
the Nordic countries -- provided greater protection to non-creative
compilations than other Member States. Eventually, efforts to
harmonize the EU copyright laws for the TRIPS Agreement left the
EU without any intellectual property protection for non-creative
compilations of data. After considering varied proposals, in March
1996 the EU adopted a Database Directive requiring all Member
States to provide a sui generis form of intellectual property
protection for databases.
The EU Database Directive became the basis
for the EU's proposal for a draft international treaty that was
submitted to the World Intellectual Property Organization (WIPO).
In anticipation of a WIPO Diplomatic Conference in December 1996,
and because of substantial concerns about provisions of the EU
proposal, the U.S. submitted its own proposal to WIPO. Ultimately,
the 1996 Diplomatic Conference focused on copyright and neighboring
rights; database protection was left unaddressed. Nonetheless,
WIPO established a timetable to resume discussions on database
protection in 1998.
In the United States, a proposal for sui
generis protection was introduced in the House in 1996 by
then-Congressman Carlos Moorhead. That proposal generated considerable
opposition from the scientific, education, and library communities.
In the 105th Congress, Howard Coble, Chairman of the House Subcommittee
on Courts and Intellectual Property, introduced H.R. 2652, which
would provide a database maker with protection against misappropriation
of any substantial part of its database, where such misappropriation
would harm the actual or potential market for the database. In
hearings in late 1997 and early 1998, scientists and educators
-- as well as telecommunications companies -- expressed significant
concerns over many aspects of the bill. Nonetheless, on May 19,
1998, H.R. 2652 passed the House by voice vote from the suspension calendar.
Recently, a corresponding bill was introduced in the Senate (S.
2291); as of the time of this memorandum, S. 2291 was co-sponsored
by Senators Grams, Cochran, Faircloth, and Helms.
In an effort to help policy makers understand
the concerns of all parties, the Patent and Trademark Office ("PTO")
held a one-day conference on database protection and access issues
("PTO Database Conference") on April 28, 1998. At that
time, H.R. 2652 had only been approved by the House Judiciary
Committee. The conference was held at the Brookings Institution
and attracted over 175 attendees representing academia, the business
community, libraries, government, non-profits,
and the scientific community.
The conference did not -- and was not expected
to -- produce consensus on any issues, including the most fundamental
issue of whether or not database protection is needed. We believe,
however, that the proceedings helped (a) initiate dialog, then
and subsequently, between various parties, and (b) identify
areas where disparate interests may be accommodated through further
legislative developments. After reviewing the conference proceedings,
we believe that the Administration should be willing to support
database protection legislation that meets five widely-supported
principles:
1. A change in the law to protect commercial
database developers from Warren Publishing-like situations
is desirable.
2. Consistent with Administration policies,
databases generated with Government funding should not be placed,
de jure or de facto, under exclusive control of
private parties.
3. Any database protection regime must carefully
define and describe databases and prohibited acts, so as to avoid
unintended consequences, including undue disruption of existing
business relationships and non-profit research.
4. Any database protection regime should
be subject to exceptions largely co-extensive with "fair
use" principles of copyright law.
5. Consistent with U.S. trade policy, it
is desirable to secure for U.S. companies the benefit of the EU
Database Directive and laws in other countries protecting database
products.
This document provides a brief summary of
the April 28 conference; our analysis of these principles both
generally and as they relate to H.R. 2652; and a few areas where
we believe further work may be needed to produce an acceptable
legal regime for databases.
II. THE PTO DATABASE ISSUES CONFERENCE
The PTO Database Conference was held on
April 28, 1998 at the Brookings Institution in Washington. Preparations
for the conference began in late January. The format of the conference
was a series of plenary sessions with mid-day "breakout"
sessions devoted to more specialized topics.
In planning conference topics and possible
panelists, we reviewed all testimony given before the House Subcommittee
on Intellectual Property in its hearings in October 1997 and February
1998. We also met with representatives of the Information Industry
Association (IIA), the Information Technology Association of America
(ITAA), and the National Research Council (NRC). We had on-going
discussions with representatives of these organizations as well
as conversations with the American Library Association (ALA),
the Association of American Publishers (AAP), the Association
of Research Libraries (ARL), and the Business Software Alliance
(BSA).
The 23 panelists and moderators consisted
of 18 Americans and 5 Europeans. These included seven legal or
economic academics (divided roughly equally between supporters
and critics of database protection proposals); six scientists
and representatives of scientific organizations; two library representatives;
and five business groups. The conference panelists/moderators
also included representatives of the State Department, the Copyright
Office, and the European Commission. Approximately 175 people
attended one or more sessions of the conference. In addition to
many people from trade associations and Washington law firms,
participants included:
from the scientific community and related government agencies,
representatives of the Centers for Disease Control, Chemical Abstracts
Service, the House Science Committee, the National Science Foundation,
the National Research Council, OSTP, the State Department, and
the U.S. Geological Survey;
from the private sector, representatives of ABC Cable and News Media,
BellSouth, Dun & Bradstreet, Eli Lilly, Fujitsu, IBM, Intermetrics,
Lexis-Nexis, McGraw-Hill, Reuters, MCI and several smaller businesses,
including information firms for realtors and insurance companies; and
from non-profit organizations, representatives
of the Modern Language Association, the Church of Jesus Christ
of Latter Day Saints, and National Public Radio.
The conference had four plenary sessions
with seven mid-day "breakout" discussion groups. Plenary
session topics reflected neutral statements of general issues
that have arisen repeatedly in Congressional hearings and scholarly
writings on database protection and access issues; several "breakout"
sessions were dedicated to thorny issues identified by specific
groups. The first plenary panel discussed whether there is need
for additional database protection; the second plenary was devoted
to the concerns of the scientific and research communities; and
the third plenary session explored the "fair use" needs
of libraries, non-profit entities, and database producers who
rely on government data. The fourth plenary session consisted
of reports to the Assistant Secretary of discussions in the mid-day
breakout sessions. Attachment A is a program of plenary topics
and breakout sessions from the conference. Individuals interested
in obtaining copies of videotapes of the first and fourth plenary
sessions (the only sessions recorded) may do so for the cost of
reproduction by contacting Justin Hughes, Office of Legislative
and International Affairs, Patent and Trademark Office, Department
of Commerce, Washington, D.C. 20231, justin.hughes@uspto.gov.
III. EMERGING PRINCIPLES AND ISSUES
A. BASIC PRINCIPLES
In light of the conference proceedings and
after reviewing Congressional testimony, scholarly writings, and
reports on these issues, we believe that a set of principles emerges
that should shape the Administration's position on database protection.
These principles could be embodied in any number of approaches,
including H.R. 2652 with appropriate modifications to reflect
these goals. After listing the principles, we discuss each principle
and analyze how H.R. 2652 fulfills or fails to achieve those goals.
1. A change in the law to protect commercial
database developers from Warren Publishing-like
situations is desirable.
2. Consistent with Administration policies,
databases generated with Government funding should not be placed,
de jure or de facto, under exclusive control
of private parties.
3. Any database protection regime must
carefully define and describe databases and prohibited acts, so
as to avoid unintended consequences, including undue disruption
of existing business relationships and non-profit research.
4. Any database protection regime should
be subject to exceptions largely co-extensive with "fair
use" principles of copyright law.
5. Consistent with U.S. trade policy,
it is desirable to secure for U.S. companies the benefit of the
EU Database Directive and laws in other countries protecting database
products.
The discussion which follows elaborates
on each of these principles.
1. A change in the law to protect commercial
database developers from Warren Publishing-like
situations is desirable.
There was considerable, albeit not complete,
consensus at the conference that some type of legislative "fix"
would be reasonable to provide commercial database producers with
protection for their products. This has been stated by leading
scientists and by legal scholars identified as critics of database
protection. A handful of people remain who insist that case law
will develop and/or that a combination of technology and contract,
copyright, and trade secrecy law offers database producers sufficient
incentive. But on the whole, there seems to be agreement that
situations like Warren Publishing and the ProCD
case are likely to arise in digital commerce and that some protection
in such situations is desirable. A recent report by the Japan
Institute of Intellectual Property reaches the same conclusion:
"In today's society the database industry has proved to be
of vital support for governmental, educational, and commercial
purposes. Since databases are plainly open to full-scale misappropriation
a lack of adequate legal protection obviously could have a range
of damaging effects on the everyday life of society. . . . Once
disclosed to the public, information can be used generally speaking
and leaving aside contractual or tortious liability, freely without
the database provider's permission or an obligation to reimburse
him for his investment. This holds equally true for the off-line
as well as the on-line market."
While the NRC principally advocates scientists'
concerns and believes science has a specific, public-minded paradigm
for data-gathering, in their seminal study of database issues,
Bits of Power, they recognized the problem existing in
the commercial sector:
"In the private sector, by contrast, commercial compilers
of data have long suffered from a risk of market failure owing
to the intangible, ubiquitous, and above all, invisible nature
of information goods and the ease with which free riders may have
appropriated the fruits of the compilers' investment once the
information goods were made available to the public in print media."
These sorts of cases are only likely to
increase with digital media. The ProCD v. Zeidenberg case
provides an example of a fact pattern that may become commonplace
without appropriate legal safeguards. In ProCD, defendant
Matthew Zeidenberg purchased ProCD's CD-ROM database of 3,000
telephone directories from around the country. He then formed
a company to sell the telephone directory information online --
for far less than the price for the CD-ROM set. ProCD prevailed
in this case at the appellate level because the Seventh Circuit
panel ruled that the "shrink-wrap" license which limited
the defendant to non-commercial use of the CD-ROMs was enforceable.
Had Zeidenberg instead given the CD-ROM set to someone
else, who later started the same company, ProCD would have had
no privity of contract with the defendant company and would
have lost control of its database. Similarly, in Warren Publishing
v. Microdos Data Inc., Warren Publishing's "Directory
of Cable System" classified cable television systems
by the principal communities they served. The directory was apparently
taken and reproduced by Microdos Data in a competitor product
sold in software format. The Eleventh Circuit, sitting en banc,
ruled that there were no copyrightable aspects to Warren Publishing's
database that had been taken by the defendant.
The database protection regime set out in
H.R. 2652 would clearly meet the goal of addressing these situations.
At the same time, this goal could probably be met with a modified
"NBA v. Motorola" approach (as amended by suggestions
of Professors Ginsburg and Reichman) built on the elements of
a misappropriation claim being: (i) the plaintiff generates or
collects information at some expense, (ii) the defendant's use
of the information constitutes free-riding on the plaintiff's
costly efforts to generate or collect it, (iii) the defendant's
use of the information is in competition with a product or service
offered by the plaintiff or likely to be offered by the plaintiff,
and (iv) the ability of other parties to free ride on the efforts
of the plaintiff would so reduce the incentive to produce the
product or service that the existence or quality of the product
would be substantially threatened. Indeed, we think
that these are largely the principles that govern H.R. 2652. Where
H.R. 2652 diverges from an NBA v. Motorola model, there
may be good reasons.
Some participants at the conference also
raised concerns about the constitutionality of different database
protection proposals. We believe that there are two principal
concerns. The first is whether the Supreme Court's interpretation
of the Intellectual Property Clause (Article I, Section 8, Clause
8) as set forth in Feist pre-empts Congressional exercise
of Commerce Clause power to legislate in this area under the doctrine
of Railway Labor Executives' Ass'n v. Gibbons ("Clause
8 pre-emption"). Given Congress's creation of discrete intellectual
property rights in areas previously treated as related to copyright
or patent (trademark, semiconductor mask protection) and the Supreme
Court's continued recognition of "non-copyright grounds"
for protection of information, we believe that a database protection
bill can be properly crafted to avoid Clause 8 pre-emption.
The second concern is what limits the First
Amendment imposes on any database protection regime. This is not
a new problem; courts have frequently dealt with the relationship
between trademark law and the First Amendment, copyright law and
the First Amendment, and trade secrecy law and the First Amendment.
All of these laws limit "speech" in which citizens may
engage but remain, nonetheless, compatible with the First Amendment.
We believe that First Amendment concerns can be addressed as long
as any database protection regime (a) permits unhampered independent
collection of information, (b) permits use of data for criticism,
news reporting, and de minimis personal communications,
and (c) recognizes a wide berth of "fair" uses that
do not substantially affect the commercial activities of the database
owner. We understand that the Department of Justice's Office of
Legal Counsel is in the process of preparing a preliminary analysis
of constitutionality issues concerning H.R. 2652; we look forward
to reviewing this preliminary analysis.
2. Consistent with Administration policies,
databases generated with Government funding should not be placed,
de jure or de facto, under exclusive control
of private parties.
There seems to be general agreement that compilations of data generated with U.S. Government funding should not be subject to any protection regime. There are several reasons for this. First, if U.S. Government-funded databases were subject to some type of protection regime, taxpayers might "pay twice" for access to data. Second, the principal argument for a protection regime is that, absent such protection, private parties will lack adequate incentives for database production. But government funding provides the incentive in the case of publicly-financed compilations, such as weather information, census data, and medical studies funded by NIH grants. As the Office of Management and Budget has stated:
"Government information is a valuable national resource.
It provides the public with knowledge of the government, society,
and economy -- past, present, and future. It is a means to ensure
the accountability of government, to manage the government's operations,
to maintain the healthy performance of the economy, and is itself
a commodity in the marketplace."
For many government agencies, the responsibility
to make government-generated information widely available is a
statutory obligation. For example, the Agriculture Department
works under a wide directive to "diffuse among people of
the United States, useful information on subjects connected with
agriculture . . ." (7 U.S.C. section 2201) while statutes
such as the Freedom of Information Act and the Government in the
Sunshine Act "establish a broad and general obligation on
the part of Federal agencies to make government information available
to the public and to avoid erecting barriers that impede public
access."
a. A Wide Definition of Government Data
While there is wide agreement on this general
proposition, some questions have been raised whether data generated
by the government (for example, from government-owned satellites)
is distinct from data generated by non-government entities funded
by the government (for example, private researchers working with
NIH grants). We believe that even if it were desirable to draw
a distinction of this sort, no statutory language could
adequately capture this distinction, particularly in a time when
efforts to "reinvent" government may lead to private
parties gathering datasets under government contracts that might
have been gathered previously by government employees. For example,
many private contractors participate in gathering data for the
decennial Census; the individuals who work for these private entities
are sworn in as "special census employees" only for
purposes of statutory confidentiality requirements and are not
federal employees under Title 5 of the U.S. Code.
H.R. 2652 presently addresses this issue
with the following broad section 1204(a) exclusion:
"Protection under this chapter shall not extend to collections
of information gathered, organized, or maintained by or for a
government entity, whether Federal, State, or local, including
any employee or agent of such entity, or any person exclusively
licensed by such entity, within the scope of the employment, agency,
or license. Nothing in this subsection shall preclude protection
under this chapter for information gathered, organized, or maintained
by such an agent or licensee that is not within the scope of such
agency or license, or by a Federal or State educational institution
in the course of engaging in education or scholarship."
We believe that this provision serves the
general policy goal of making all forms of government information
available to the public, but we believe that the language can
be improved. In response to concerns raised at the "publicly-funded
data" breakout session about the different government contractual
arrangements with laboratories and private companies, we suggest
that the drafters of H.R. 2652 should examine existing definitions
of "government information" for descriptions that capture
a fuller range of government-sponsored data collection. For example,
OMB Circular A-130 states that "the definition of 'government
information' includes information created, collected, processed,
disseminated, or disposed of both by and for the Federal Government."
At a minimum, we are concerned that the
present language does not adequately cover situations in which
the government contracts for information gathering. It was pointed
out at the conference that government contracts sometimes expressly
preclude the private entity from being an "agent" or
"licensee" of the government -- thus removing their
activities from the ambit of section 1204(a) as presently written.
One way to address this would be inclusion of "contracting
for the government" language. Another possibility would be
to add statutory language to section 1204(a) providing that the
exclusion also applies to data gathering "funded by the government,"
together with discussion in the legislative history making
clear that section 1204(a) applies to databases developed
by a private entity as a necessary part of a government-funded
contract, whether or not "gather[ing], organiz[ing], or maintain[ing]"
a collection of information was the purpose of the government
contract. For example, if a company working on airport safety
under contract from the FAA builds a database of airport characteristics
that is required to complete its contract with the FAA, then the
company should not be able to assert any exclusionary rights over
the airport database. It may be possible to develop standards
for when a database is necessary for a government contract from
existing standards for when government agencies must collect data.
In any case, the same rationales apply to government contracting
as to data generated by the government itself: government funding
already provides an adequate incentive and there is no reason
taxpayers should pay 'twice' for data gathering.
The distinction that needs to be drawn is
between (a) compilations of data made as a necessary element of
a Government-funded activity, and (b) compilations of data made
by private entities over and above the activity being funded.
This appears to be the intent of the section 1204(a) language
that:
"Nothing in this subsection shall preclude protection under
this chapter for information gathered, organized, or maintained
by [a government] agent or licensee that is not within the scope
of such agency or license . . ."
This appears to protect other activities
of a government licensee and to permit protection of value-added
databases that the licensee generates from government data. Nonetheless,
we think that this section could be clarified by express language
(or discussion in the legislative history) that transformative
developments from government compilations of data can be protected,
i.e. that value-added activities outside the ambit of a government
contract can produce protected databases, subject to the general
principle -- drawn from copyright law -- that where government-funded
data and value-added data are commingled and the government-funded
data predominates, then the private data producer should take
affirmative steps to distinguish the two types of information.
b. State and Local Government Data
Another minor question in this area has
been whether data generated from funding by state or local governments
should be treated differently than data generated from funding
by the Federal Government. The above language takes the position
that data generated with funding from any level of government,
federal, state, or local, may not take advantage of the H.R. 2652
database protection regime. The Committee Report notes that this
"exclusion is broader than the similar provision in section
105 of the Copyright Act" in that it applies to state and
local governments. This raises some interesting questions.
Given the rationale that taxpayers should
not "pay" for databases twice, this does create the
possibility that, for example, a database whose creation was funded
by the California state government will be used by private citizens
of Arizona -- giving the Arizonans a free-ride on the California
taxpayers' investment. Nonetheless, we agree with the Committee's
approach because of the importance of developing a strong, clear
principle that government-generated data is not subject to exclusion.
c. University Generated Databases
Section 1204(a) is currently worded to ensure
that data gathered by state-funded colleges and universities may
enjoy 2652 protection:
"Nothing in this subsection shall preclude protection under
this chapter for information gathered, organized, or maintained
by such an agent or licensee that is not within the scope
of such agency or license, or by a Federal or State educational
institution in the course of engaging in education or scholarship."
According to the Committee Report, "educational
institutions that happen to be government owned should not be
disadvantaged relative to private institutions when producing
databases unrelated to the provision of regulatory government
functions."
This is a topic where guiding principles
may conflict. What happens with a database gathered by medical
researchers at a state university working under a federal grant
from NIH? Should this be excluded from 2652 protection on the
ground that it is government-funded research (and data for which
the American public has already paid)? Or should the database
be eligible for 2652 protection on the grounds that it comes from
"a Federal or State educational institution in the course
of engaging in education or scholarship" and the principle
that state-funded schools should not be disadvantaged relative to private
universities?
Administration policies clearly establish
that the U.S. Government has a right to disseminate data produced
by any federal grant to institutions of higher education, hospitals,
and non-profit research organizations. OMB Circular A-110 states
the general framework, including the U.S. Government's right to
a "royalty-free, non-exclusive and irrevocable" license
to any copyright and, concerning compilations of information:
(c) Unless waived by the Federal awarding
agency, the Federal Government has the right to (1) and (2):
(1) Obtain, reproduce, publish or otherwise
use the data first produced under an award.
(2) Authorize others to receive, reproduce,
publish, or otherwise use such data for Federal purposes.
In keeping with this policy and our belief
that Government-funded data should not be subject to 2652 protection,
we believe that databases resulting from research directly funded
by the government, whether generated by a for-profit entity or
a non-profit entity, should be ineligible for 2652 protection.
No distinction should be drawn between the research being funded
at Sloan-Kettering, Harvard, Michigan State, or a Kaiser Permanente
hospital as long as the research is directly funded by the government.
On the other hand, we think that a professor working at a state
university without any government grant beyond her state university
salary and laboratory funds should be able to apply 2652 protection
to a database resulting from her work. This would address what
might otherwise be an inequitable situation between private institutions
like Amherst College and USC versus state institutions like the
University of Massachusetts at Amherst and UCLA. It may be difficult
to craft statutory language that absolutely resolves this problem,
but we believe this should be thoroughly addressed in the legislative
history. The issue can and should be expressly addressed in government
grants.
d. Realistic Government Action in an H.R. 2652 Environment
All parties should recognize that §
1204(a), whether as currently worded or amended along the lines
suggested, will require diligence on the part of government contracting
agents to ensure that delivery of data (in a reasonable form)
to the public is part of the described government-funded activity.
Otherwise, licensees could argue that the form in which they were
making the data available to the public was a value-added format
and "outside" the scope of their government contract.
We think that any future legislative report should make clear
that when the government contracts with a private
firm to produce data, the goal is usually not only to produce
data, but also to make that data reasonably available to the relevant
public in at least raw form.
At the same time, we think that the discussion
about database protection and the need to keep government-generated
data in the public domain has ignored one fact: that the U.S.
Government has already undertaken some programs intended to generate
scientific data and not place it in the public domain.
For example, the Sea-viewing Wide Field-of-view Sensor ("SeaWiFS")
is a "cost-sharing collaboration" between NASA and Orbital
Sciences Corporation (OSC) "wherein NASA's Goddard Space
Flight Center . . . specified the data attributes and bought the
research rights to these data" while "OSC provided the
spacecraft, instrument, and launch" and retains "the
operational and commercial rights to these data." The Space
Commercialization Act is a broader example of government/private
sector collaboration in which the government partially funds research
efforts conscious that the resulting data will be commercialized.
Federal agencies are under direction to ensure that "information
systems do not unnecessarily duplicate information systems available
. . . from the private sector." For example, NOAA buys substantial
amounts of data from private entities and negotiates the terms
for data usage in such buys. How would §1204(a) relate to
these efforts?
There are two possible, alternative answers.
First, it would be credible to take the position that while the
Government may engage in collaborative programs with private entities,
both the Government and the private entities do so without the
benefit of any database protection law, i.e., the results of the
Government's collaborative projects with private industry can be protected
by any of the means now available -- technological means of controlling
access, contract law, etc. -- but not by the new law. This would
suggest that the first sentence of §1204(a) should be written
to "govern" all public/private joint ventures. The second
alternative is to say that the second sentence of §1204(a)
governs: depending on how the government/private entity contract
is crafted, certain uses of data can be outside the government
license, contract, or agency, such that a private company like
OSC can enjoy database protection rights. Given the existence
of the Space Commercialization Act, we think that a final resolution
between these two alternatives is a broader question than H.R. 2652.
Our hope is that H.R. 2652 will be compatible with either view.
3. Any database protection regime must
carefully define and describe databases and prohibited acts, so
as to avoid unintended consequences, including undue disruption
of existing business relationships and non-profit research.
Defining a database or "compilation
of information" is one of the most daunting tasks in drafting
any database protection or access law. We believe that a database
protection law should exclude the following from the ambit of
protection: (a) audio-visual works, despite the fact that they
are arguably "compilations" of film frames; (b) narrative
texts, whether fiction or non-fiction, regardless of length, despite
these being "compilations" of words; and (c) pieces
of music, whether in sheet music or recorded performance form,
despite these being "compilations" of chords, lyrics,
musical notes, etc. We are also unsure that the present bill adequately
addresses concerns about datasets embedded in the nation's telecommunications
infrastructure.
This challenge of defining "compilations
of information" is one area where we believe there is room
for improvement of H.R. 2652, either in the statutory language
or in legislative history which can clarify Congress' intent.
At present, H.R. 2652 defines a compilation of data as follows:
"1201" As used in this chapter:
"(1) Collection of information. -- The term 'collection of information' means information that has been collected and has been organized for the purpose of bringing discrete items of information together in one place or through one source so that users may access them.
"(2) Information. -- The term 'information' means facts,
data, works of authorship, or any other intangible material capable
of being collected and organized in a systematic way."
The accompanying legislative report says the following
about the subsection:
"Section 1201 . . . defines 'collection of information'
. . . . The definition is intended to avoid sweeping too broadly,
particularly in the digital environment, where all types of material
when in digital form could be viewed as collections of information.
It makes clear that the statute protects what has been traditionally
thought of as a database, involving a collection made by gathering
together multiple discrete items with the purpose of forming a
body of material that consumers can use as a resource in order
to obtain the items themselves. This is in contrast to elements
of information combined and ordered in a logical progression or
other meaningful way in order to tell a story, communicate a message,
represent something, or achieve a result. Thus, a novel would
not be considered a 'collection of information' even if it appears
in electronic form, and therefore could be described as made up
of elements of information that have been put together in some
logical way. Similarly, materials such as interface specifications
would not ordinarily be covered, although a collection of such
specifications created in order to provide consumers access to
the individual specifications could be covered."
In terms of the general definition, we think
that this present language takes a viable approach, but that it
can be improved.
For example, the EU Directive differs from
the present definition in H.R. 2652 in requiring that the information
be "arranged in a systematic or methodical way and individually
accessible by electronic or other means." [Article 1(2)]
The problem with the EU definition is that single frames of films
and specific parts of songs are already "individually accessible"
and will become more so with increasing digitization; we think
that the true difference between a database and, on the other
hand, a film or song is that the elements of a database are intended
to be accessed individually. They are also intended
to be accessed in sets and subsets, as when one uses a column
of information in a spreadsheet database. This suggests a definition
of a compilation based on the intention that elements be accessed
in a particular way: a database is information collected for
the purpose of allowing users to access items of information both
individually and in sets or subsets of related items of information.
We understand that this may have been the intent of the §1201(1)
language that a collection of information is "information
that has been collected . . . for the purpose of bringing discrete
items of information together in one place or through one source
so that users may access them" but the language could more
clearly convey this intention by shifting where the "purpose"
is located and introducing the notion of accessing data individually
or in sets, i.e. a collection is "information that has been
collected . . . in one place or through one source for the purpose
of allowing users to access items of information both individually
and in sets or subsets of related items of information."
We believe, however, that no abstract definition
of a database will give us a bright line border between databases
and non-database works. Therefore, we think that clear legislative
history on this question is especially important. For example,
where the current legislative report gives the example that "a
novel would not be considered a 'collection of information' even
if it appears in electronic form . . . " we think that the
legislative history should enumerate several examples of works
with a "logical" or "linear" progression (or
a representational nature) that are not intended to be protected
as databases: audio-visual works, video games, computer software
code, fictional narrative texts, non-fictional narrative texts,
and photographs. We think that the single example of a fiction
novel in the present legislative report is especially troublesome
because it does not sufficiently clarify the important point that
a non-fiction narrative text should also fail to qualify as a
database.
A second area of concern with the current
definition of a database relates to computers and the Internet.
The statute expressly states in §1204(b)(2) that "[a]
collection of information that is otherwise subject to protection
under this chapter is not disqualified from such protection solely
because it is incorporated into a computer program." Read
by itself, this strongly suggests that all databases in computer
programs are protected. Many such embedded databases are not intended
for human perception; we believe that these databases should be
protected on a "sweat of the brow" justification to
avoid situations in the future in which competitors steal significant
unprotected value-added from software makers. This appears to
be something the House Subcommittee did not fully consider. (There
was no testimony before the Subcommittee on this subject during
its two days of hearings.)
While we believe that protection should
be afforded to datasets built into software and made through substantial
investments, regardless of whether they are "accessed"
by humans or not, there seems to be some equivocation in the bill
and its legislative report. First, the definition of a "collection
of information" in §1201(1) speaks of information arranged
"so that users may access them." On the one hand, this
ambiguous term coupled with the software inclusion provision of
§1204(b)(2) would suggest that non-human "users"
might qualify. On the other hand, the legislative report states:
". . . materials such as interface specifications would not
ordinarily be covered, although a collection of such specifications
created in order to provide consumers access to the individual
specifications could be covered." [discussion
of §1201]
The use of "consumers" in this
phrase suggests a human-use standard is intended, but this is
not clear. We agree that the "interface specification"
problem should be resolved as the legislative report states, but
we also believe that the software-embedded database problem apparent
in the Gates Rubber opinion should be resolved favorably
for the parties investing in these databases; this would, at a
minimum, suggest different language in the legislative history.
4. Any database protection regime should
be subject to exceptions largely co-extensive with "fair
use" principles of copyright law.
There seems to be general agreement that
any database protection regime should be subject to exceptions
with approximately the same scope as copyright "fair use."
Some critics would call for exceptions with at least the
same scope as "fair use." The most significant detractors
from this view are those who argue that such discussions of fair
use demonstrate that any database protection regime is actually
a copyright law -- lurking under a different label and forbidden
by the Supreme Court's ruling in Feist.
A. The 1203(d) Exception
H.R. 2652 does not provide exceptions from
liability parallel to those in the copyright law. Some take the
position that the bill's exceptions are not as broad as copyright
fair use; some argue that the bill gives broader exceptions. We
think this issue merits further attention. The main exception
from liability provided by the bill is § 1203(d) which provides
as follows:
"(d) Nonprofit Educational, Scientific, or Research Uses.
-- Nothing in the chapter shall restrict any person from extracting
or using information for nonprofit educational, scientific, or
research purposes in a manner that does not harm the actual or
potential market for the product or service referred to in section
1202."
We agree that this language does not solve
the problem of databases actually developed for scientists or
researchers. At least one representative of the scientific community
at the April 28 conference has further criticized this proposal
as "illusory"; we understand this criticism, i.e. that
§ 1203(d) really adds nothing to § 1202. But we believe
that the § 1203(d) language recognizes a wide range of exceptions.
For example, § 1203(d) would permit the following research
uses:
+ A statistician uses lists from the AMA's
directory of physicians and the Martindale-Hubbell directory of
attorneys to do a statistical analysis of the distribution of
recently graduated medical specialists correlated to different
legal specialties, particularly personal injury lawyers, among
major metropolitan areas;
+ A sociologist reproduces some of Warren
Publishing's list of cable operators in a book on the effects
of mass media in America;
+ A statistician and an economist reprint
sections of Phillips Business Information's Electronic Commerce
Directory and Canadian Electronic Commerce Directory
in their comprehensive study of e-commerce developments in NAFTA
countries;
+ A biologist specializing in mammalian
metabolism integrates drug testing data from a study done and
publicized by a pharmaceutical company (to promote the efficacy
of its drug) in her scholarly analysis of mammal reactions to
certain chemical compounds;
+ A medical researcher uses grocery shopping
data generated from checkout scanning equipment in supermarkets
(which is marketed back to supermarkets and to food companies)
to study the possible effects of consumption patterns on cancer
rates.
One concern is that businesses will try
to define their "actual" and "potential" market
broadly to include these research uses, either in litigation claims
or (for the far-sighted party) in their business plan for any
new database. We think this is a possibility, but not a great
danger. As with any legislation, some private parties will try
to manipulate their behavior to gain undue advantage from statutory
language and courts must curb such activities. We think that this
concern about harm to a "potential" market for a database
can be addressed through some improvement of § 1203(d) discussed
below.
Given the amount of discussion at the conference
on fair use, it is worthwhile to directly compare how § 1203(d)
and other exculpatory provisions of H.R. 2652 would work in comparison
to copyright's principal fair use provision, 17 U.S.C. §
107. Section 107 states that "fair use" is the use of
copies "for purposes such as criticism, comment, news reporting,
teaching (including multiple copies for classroom use), scholarship,
or research . . . ." But not all uses in these categories
are fair uses; instead a court must consider four factors:
"(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
"(2) the nature of the copyrighted work;
"(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
"(4) the effect of the use upon the potential market for
or value of the copyrighted work."
Initially, it should be noted that §
1203(d) of H.R. 2652 offers a stronger exception than 17 U.S.C.
§107 because §1203(d) is absolute -- if a party falls
into its description, the exception applies. In contrast, 17 U.S.C.
§107 requires a court to weigh the four factors, so a use
that falls in the "teaching" or "research"
description may still infringe.
Of the four fair use factors, §1203(d)
already addresses "(1)" by stating that the present
exclusion applies to "nonprofit educational, scientific,
or research purposes". There may be some question whether
the word "nonprofit" modifies only "educational"
or modifies all three adjectives "educational, scientific,
or research." The legislative report sheds limited light on
this point, particularly because it uses the same grammatical
construction twice. The report does say that §1203(d) is
intended to "alleviate concerns expressed by members of the
research, scientific, and university communities"; since
none of those concerns have been expressed by for-profit researchers,
we take §1203(d) to refer to nonprofit activities, whether
educational, scientific, or research. We think that for-profit
research, as in research laboratories at companies like Amgen,
IBM, or Ford, would fall outside the ambit of §1203(d).
Opponents of the legislation have criticized
H.R. 2652 for not including the second and third of the four §
107 fair use factors:
"(2) the nature of the copyrighted work;
"(3) the amount and substantiality of the portion used in
relation to the copyrighted work as a whole."
Supporters of the bill have responded that
these criteria are already built into the H.R. 2652 framework
and do not need to be restated in the exception(s).
As concerns factor (2) of §107 -- which
calls for consideration of the "nature of the copyrighted
work" -- proponents of H.R. 2652 argue that since it only
covers databases, a court enforcing H.R. 2652 would not need to
engage in the same type of "nature of the . . . work"
analysis. We generally agree that a court enforcing a law modeled
on H.R. 2652 would not face the wide range of works that are copyrightable
-- from feature films to sculptures to non-fiction scholarly articles.
A court enforcing a law modeled on H.R. 2652 would not face what
is arguably the principal distinction in § 107(2) analysis:
whether the work is fictional or factual.
At the same time, we recognize that there
will still be significant variations in the kinds of databases
that would be subject to this law. We address this point in the
"balancing" discussion below. Similarly, the bill's
proponents have pointed out that the third fair use factor of
§107 calls for analysis of the "substantiality"
of the infringement and that H.R. 2652 largely achieves the same
effect by creating liability only if there has been a "substantial"
taking of the database. Many critics of H.R. 2652 at the conference
seemed to prefer a balancing test that would allow courts to consider
degrees of substantiality in the taking.
The fourth fair use factor under copyright
law is the effect of the use on the "market" for the
protected work. Supporters of H.R. 2652 say that it provides this
test because §1203(d) shields researchers from liability
unless there is "harm" to "the actual or potential
market for the product or service referred to in section 1202."
On this point, we believe that the § 1203(d) exception could
be improved by inserting "substantially" or a similar
standard before "harm" so that any person may extract
or use "information for nonprofit educational, scientific,
or research purposes in a manner that does not substantially
harm the actual or potential market for the product or service
referred to in section 1202." Such a "substantial harm"
standard is familiar to courts; would focus judges on the primary
market for a database; and, in the face of a database owner contending
that "science" or "research" were his intended
markets, would tend to exculpate researchers who used the database.
Another possibility would be an exemption
"for nonprofit educational, scientific, or research purposes
in a manner that does not unreasonably harm the actual
or potential market for the product or service referred to in
section 1202." This test follows the spirit of Article 9(2)
of the Berne Convention that exemptions from copyright protection
are permitted which do not "unreasonably prejudice the legitimate
interests of the author." Yet another option that might be
considered would be to expose nonprofit researchers and scientists
to liability only for harm to an actual market and eliminate any
potential liability for effects on "potential" markets.
There is a reasonable basis for drawing this distinction: commercial
actors are more likely to know the potential market of a competitor
through market research and business planning than nonprofit actors
who are not market participants.
H.R. 2652 also provides exceptions for "extracting
information . . . for the sole purpose of verifying the accuracy
of information independently gathered, organized or maintained
by that person" [§ 1203(c)] and for "extracting
or using information for the sole purpose of news reporting, including
news gathering, dissemination, and comment," unless the information
has been gathered by a news agency for a like purpose -- an exception
to the exception intended to capture the INS case [§
1203(e)]. The bill also includes an express protection for independent
gathering of the same data [§ 1203(b)]. In general, we believe
that these are all reasonable, appropriate, and in the spirit
of fair use. To parallel First Amendment concerns manifest in
copyright law, H.R. 2652 could include an express exception for
"criticism" similar to the existing § 1203(e).
Finally, even if the § 1203(d) exception
largely captures the substantive content of § 107 of the
Copyright law, one of the concerns repeatedly expressed at the
April conference was that H.R. 2652 does not include a "balancing"
mechanism to give judges more leeway in determining what uses
of compilations of data should be shielded from liability. We
would not be opposed to the addition of a balancing mechanism
in H.R. 2652 that indicated to judges that they could exercise
more leeway in considering the "nature" of the work
and the "amount" of the copying above "substantiality"
in determining what kind of liability a non-profit "educational,
scientific, or research" entity should face; the possibility
of such a revision, however, turns on a clear understanding of
how the remedies provisions of H.R. 2652 function.
B. Remedies-Delineated Exceptions
A second area where the fair
use-like elements of H.R. 2652 might be clarified or strengthened
for the benefit of nonprofit researchers and educators is the
bill's remedies provisions. According to proponents of H.R. 2652,
it has already been amended to effectively shield scientists,
libraries, and researchers from monetary damages, i.e. such institutions
and individuals would only be subject to injunctive relief. This
is best seen by a review of the remedies provisions.
i. Civil Remedies
Civil remedies are provided in
§ 1206 of the bill. Subsection 6(a) provides for federal
court jurisdiction "without regard to the amount in controversy";
subsections 6(b), (c), and (d) empower the court to award, respectively,
temporary and permanent injunctions; impoundment and "modification
or destruction of copies"; and defendant's profits, treble
damages, and attorney's fees.
Subsection 6(d) also provides that where a
court determines that a database producer brought an action "in
bad faith against a nonprofit educational, scientific, or research
institution, library, or archives, or an employee or agent of
such an entity, acting within the scope of his or her employment"
the court shall award costs and attorney fees against the database
producer. This is clearly intended as a disincentive to frivolous
lawsuits against nonprofit entities. More importantly, subsection
6(e) provides:
"Reduction or Remission of Monetary Relief for Nonprofit
Educational, Scientific, or Research Institutions. -- The court
shall reduce or remit entirely monetary relief under subsection
(d) in any case in which the defendant believed and had reasonable
grounds for believing that his or her conduct was permissible
under this chapter, if the defendant was an employee or agent
of a nonprofit educational, scientific, or research institution,
library, or archives acting within the scope of his or her employment."
We believe that it would be desirable to
consider ways this exception from monetary liability could be
clarified or strengthened. In particular, we think that the following
changes should be considered:
(a) The existing language in subsection
6(d) concerning costs and attorney's fees, including the provision
for mandatory costs and fees against a plaintiff who sued a nonprofit
entity in bad faith, could be moved to a new subsection 6(f); and/or
(b) The remaining subsection 6(d) could
be amended to make it clear, immediately, that the monetary damages
described therein are "subject to the limitation described
in subsection 6(e)"; and/or
(c) Subsection 6(e) could be amended to clarify
that the burden of proof would fall on the plaintiff to establish
that the defendant knew or had reasonable grounds to know that
its actions were not permitted under the law; and/or
(d) Subsection 6(e) could be amended to
eliminate any initial awarding of damages. As presently written,
subsection 6(e) intimates that a court would first award monetary
relief (damages, profits, etc.) against a nonprofit defendant
and then be required to "reduce or remit entirely" that
monetary relief; and/or
(e) Subsection 6(e) could be amended to
require a court to deny any monetary relief absent a showing that the
defendant knew or had reasonable grounds to know that its actions
were not permitted under the law.
Any of these changes, singly or in combination,
could make it easier for nonprofit institutions to establish the
"ground rules" for when they might face monetary liability.
To the degree that clear ground rules can be established for
researchers so that they know they will, at worst, be subject
only to injunctive relief, we believe that this would substantially
eliminate any "chilling effect" H.R. 2652 might have
on non-profit educational and research activities.
ii. Criminal Remedies
H.R. 2652 includes criminal sanctions in
§ 1207 which provide for a fine up to $250,000 and up to
five years imprisonment. Subsection 1207(a)(2) provides a very
clear exception from any criminal liability for any "employee
or agent of a nonprofit educational, scientific, or research institution,
library, or archives acting within the scope of his or her employment."
We believe that some criminal provisions are desirable to handle
LaMacchia-like situations, i.e. in which judgment-proof
individuals might seek to disseminate protected databases without
any profit incentive. We also believe that the protection against
criminal prosecution for nonprofit entities and individuals is
adequately strong.
The Department of Justice has informally
recommended that § 1207 be amended to distinguish between
"misdemeanor" and "felony" liability, with
the latter available only for damage to a database producer exceeding
$20,000. We understand that Justice is concerned that a statute
establishing a relatively new form of liability should not have
too low a threshold for criminal liability. We think that such
a change would be appropriate, although it will only impact commercial
and private entities and individuals -- not the nonprofit entities
and individuals already exempted from the criminal provisions
of the bill.
5. Consistent with U.S. trade policy,
it is desirable to secure for U.S. companies the benefit of the
EU Database Directive and laws in other countries protecting database
products.
There was much discussion at the April conference
of the effect of the EU Directive's "reciprocity" provision
on American database producers. Unlike in a "national treatment"
scheme, U.S. companies do not automatically enjoy the protections
afforded by the Directive's sui generis protection scheme.
Presently, a database of a U.S. company is protected under the
EU laws only if the U.S. company has a substantial economic presence
in an EU Member State. A recent comparative study from Japan has
concluded that "the existing disparity between US and EU
database protection gives European database producers a distinct
advantage" and that "[i]t may be argued that this reciprocity
requirement enables European database producers to grow by exploiting
US databases as long as the US . . . fails to provide an equivalent
level of protection for European databases."
An American firm that does not enjoy protection
under the EU Directive faces several possible competitive disadvantages.
First and most obviously, its noncopyrightable database may be
duplicated and remarketed by others. Second, European data sources
looking for a firm to "process" and market raw data
will be more likely to enter into a contract with a European company
that can guarantee protection of the database versus an American
company that cannot. Thus, even if the American firm could effectively
protect the database with technology and contract law, it may
be at a disadvantage in obtaining "suppliers" of data.
Could the U.S. force the EU to protect American
databases in the absence of a U.S. database protection law? The
U.S. has already cited the reciprocity provision of the Database
Directive as one reason the EU was placed on the Priority Watch
List in this year's Special 301 review process. Nonetheless, the
U.S. has limited pressure it can bring to bear on the EU. We believe
that the failure of the EU Directive to provide national treatment
probably does not violate TRIPS. Because the Directive offers
copyright protection to databases on virtually the verbatim terms
required by TRIPS (Article 10(2)), the additional protection of
the EU sui generis regime is probably not subject to the
TRIPS national treatment requirement. This means that in order
to protect all U.S. database producers, the U.S. would have to
adopt domestic legislation that the European Commission would
judge to be comparable to the EU Directive.
A set of more abstract arguments is pitted
against the general desirability of giving American firms the
benefit of the EU Directive's reciprocity provision. First, there
is the argument that given U.S. advocacy of national treatment,
we should not condone the EU's use of reciprocity in their Database
Directive because it will embolden both the EU and other countries
to use reciprocity in other policy areas. The concern is that
this would cause a breakdown of the national treatment doctrine
under international law and "further balkanization of data
availability conditions." We agree that there will be some
superficial inconsistency between opposing the Directive's reciprocity
approach and any U.S. adoption of a database protection regime
that appears intended to meet the reciprocity requirement. But
the U.S. often responds to the acts of other countries while disagreeing
with those acts; the true inconsistency with our stated international
policy would only be if a U.S. database protection law required
reciprocity.
The question remains whether H.R. 2652
would be sufficiently comparable to the EU Directive.
We believe that H.R. 2652 offers protection that is equivalent
to the EU Directive and would give the United States a strong
position to insist with the EU Commission that U.S. nationals
enjoy the full benefits of the EU Directive:
Like the EU Directive, H.R. 2652 protects
investment, qualitative or quantitative, in a database [EU art.
7(1); HR § 1202];
Like the EU Directive, H.R. 2652 prohibits
unauthorized takings of the whole or a substantial part
of a database [EU art. 7(1); HR § 1202];
Like the EU Directive, H.R. 2652 permits
insubstantial takings [EU art. 8(1); HR § 1203(a)], but prohibits
unauthorized repeated takings of insubstantial parts
of the database [EU art. 7(5); HR § 1203(a)];
Like the EU Directive, H.R. 2652 applies
separately from copyright [EU art. 7(4); HR § 1205(c)];
The EU Directive permits exceptions for
"teaching or scientific research" [EU art. 9(b)] of
the sort set out in H.R. 2652 [HR § 1203(d)];
Like the EU Directive, H.R. 2652 provides
a fifteen-year term of protection [EU art. 10; HR § 1208(c)];
Like the EU Directive, H.R. 2652 provides
that it does not alter the effect of any other intellectual property
laws [EU art. 13; HR § 1205(a)].
The principal differences between the two
approaches include:
While the EU Directive establishes a sui generis property
right "located in the neighborhood of copyright," H.R.
2652 adopts a misappropriation approach that targets particular
acts;
The EU Directive appears to permit renewal
of protection for an entire database when the database is revised
[EU art. 10(3)] while H.R. 2652 permits a new term of protection
only for the new elements of the revised database [HR § 1208(c)];
The EU Directive arguably has a narrower
definition of a database than H.R. 2652;
The EU Directive and H.R. 2652 take different
approaches on the exemptions carved out of the protection regime.
We believe, on the whole, that the comparable
aspects of the two regimes far outweigh the differences. The case
that H.R. 2652 provides comparable protection is strengthened
by the fact that direct comparisons are not appropriate: the Directive
provides guidance to the EU Member States for implementing legislation.
Thus, each provision of H.R. 2652 that arguably diverges from
the Directive should be compared to the parallel provision in
each of the fifteen Member States' implementing laws. Only if
all fifteen Member States adopted implementing legislation
completely different from the H.R. 2652 provision would this be
a ground for concluding that the two are not "comparable" in that
respect.
B. OTHER ISSUES
1. Databases Prepared for Scientific
Markets
We believe that there remains at least one
place where the interests of database producers and scientists/educators
may be in a "zero sum" conflict: how to handle collections
of information specifically prepared and marketed to scientists
and educators. The problem is apparent in the § 1203(d) exception
that shields "extracting or using information for nonprofit
educational, scientific, or research purposes" as long as
such activity "does not harm the actual or potential market
for the product or service referred to in section 1202."
Many people have pointed out that this does not exempt from liability
extraction/use from databases marketed to the nonprofit
scientific or research communities.
This is a place where the desire to provide
proper incentives for the production of databases runs squarely
into the desire to provide as much access to information as
possible to researchers and educators. If a commercial firm creates
a database intent on educators/researchers being a substantial
part of the market for that database, then consistent application
of the incentive rationale requires that the firm have the same
protection against educators/researchers that it would have against
others in the marketplace. This is also consistent with Congress'
recognition that a number of types of copyrighted works -- such
as informational newsletters targeted to particular audiences,
textbooks, testing materials, and other materials prepared for
the school market -- may not enjoy as wide a range of fair use as
other types of materials.
2. "Sole-Source" Database Issues
Both prior to and during the conference,
the debate over database protection has frequently turned to the
issue of "sole source" databases. Critics of database
protection proposals have often advocated that databases which
are the only source for certain types of information should be
treated differently from other databases. The argument is that
otherwise, any "sole source" database protection scheme
would create a monopoly over access to the facts in these sole-source
databases. A frequently heard proposal is that such sole source
databases should be subject to some type of mandatory licensing
system.
There is an initial problem in defining
what is meant by a "sole source" database. Is it an
absolute sole source for the data? Or is it a practical
sole source for the data? We believe that there is a tremendous
difference between the two and that critics of database protection
frequently use the former extreme cases to advocate mandatory
licensing or similar restrictions on a broader range of compilations.
Examples of an absolute sole source
database would include (a) measurements of solar flares
during a specific period that were made at only one telescope,
(b) temperature and air content measurements made inside a cave
by the initial spelunkers who discovered it and opened it to the
surface, and (c) historic climatological measurements for a specific
location that were made by only one party. In fact, scientific
measurements are among the most likely candidates to be absolutely
unique datasets. There are also many unique sources of historic
data; the Mormon Church's genealogical records, for example, might qualify.
If it is correct that these are the vast
majority of true sole-source databases, then access to information
in sole-source databases may not be a significant issue in any
database protection regime which (a) does not apply to government-funded
data and (b) has a reasonably defined sunset on database
protection rights. Critics of database protection have,
however, broadened their view of "sole-source" databases
to include those where, while the raw information still exists
in the world and could be collected independently, the information
has been collected and commercialized by only one party. The argument
is that the information is, for practical purposes, under the
control of a single entity and because there is no competition
the database owner will extract monopolist rents from users.
The problem with this argument is that it
cuts too wide. There will inevitably be many small markets that
can only be viably served by one firm; we should expect that the
number of such niche markets will only increase with time. Instituting
a mandatory licensing system would, in effect, penalize those
who are "first to market" in serving these niche demands.
It is undesirable to create an IP regime that dissuades firms
from entering such small markets. Our country takes, for example,
the opposite approach with the "orphan drug law" --
which is intended to give firms an incentive to fill and stay
in niche markets for which R&D costs cannot be easily recovered.
Similarly, in the copyright field, there has been recognition
that fair use should be drawn more narrowly when the producer
of the work is supplying a small market.
H.R. 2652 offers a limited response
to possible sole source monopolist pricing by expressly providing
in section 1205(d) that nothing in the statute affects "Federal
and State antitrust laws, including those regarding single suppliers
of products and services." This raises a minor concern: under
patent and copyright law, courts have developed "misuse"
doctrines independent of antitrust law. Does the express mention
of antitrust law in H.R. 2652 preclude a "database protection
misuse" doctrine? We think the answer is unsettled, albeit
probably 'no.' To clarify this possible ambiguity, we suggest
that § 1205(d) be written so as to ensure that courts
remain free to develop any equitable doctrine that would
be appropriate in this area. We think that this would be the easiest
way to preserve unambiguously the possible use of doctrines
like unclean hands or "misuse" against database producers.
If such language were not adopted in the
act, we would recommend that the legislative history make clear
that express consideration of the antitrust laws in the statute
does not prevent the courts from denying relief to a database
producer on equitable grounds or from developing
a "database protection misuse" doctrine.
3. Distinguishing Protected from Unprotected
Material: the issues of "perpetual" protection and value-added
compilations of government-generated data
One of the places where a neutral observer
might wonder if the sides are speaking about the same issue is
the question of the duration of protection. Critics of
database protection frequently claim that a regime of "perpetual"
protection would be created or that proposals call for protection
greater than copyright protection --- yet the current legislative
proposal calls for a 15 year duration (and copyright endures for
the life of the author plus 50 years). For reasons we will explore
below, this problem has certain contours in common with the issue
of privately-held, sole source databases from government-generated
data.
The critics' concern about "perpetual
protection" is rooted in the need to provide some type of
protection for revisions of databases. If legislation were
passed that provided protection to new databases, but did not
provide protection to revisions of databases, this would skew investment.
There would be a disincentive to revise proven, useful databases
in favor of creating new databases. Reassembling (largely) the
same information in a new database would be inefficient not only
for data gatherers, but for data users who -- in order to use
the most current data -- would have to accustom themselves to
the format of the new database. The drafters of H.R. 2652 believe
they resolve this problem with the general definition of what
is protected and the 15 year statute of limitations:
"[N]o action can be maintained more
than fifteen years after the investment of resources that qualified
that portion of the collection of information that is extracted
or used. This language means that new investments in an existing
collection, if they are substantial enough to be worthy of protection,
will themselves be able to be protected, ensuring that producers
have the incentive to make such investment in expanding and refreshing
their collections. At the same time, however, protection cannot
be perpetual; the substantial investment that is protected under
the Act cannot be protected for more than fifteen years. By focusing
on that investment that made the particular portion of the collection
that has been extracted or used eligible for protection, the
provision avoids providing on-going protection to the entire collection
every time there is an additional substantial investment in its
scope or maintenance." (Legislative Report)
We believe that this does not wholly address
the concerns of those who believe that the bill could create "perpetual
protection." While the bill provides no de jure perpetual
protection, many users believe that the digital environment might
be manipulated in some situations to produce de facto perpetual
protection.
This potential problem is limited to a discrete
set of databases. Some databases are revised extensively and constantly;
for these databases, the value of any given version lasts much less
than 10 or 15 years. Stock exchange price listings are the most
extreme example, but other lists -- realtors' sale listings and
used car valuations -- also fall in this category. Other databases
will be revised rarely, if ever, once a definitive version is
completed, e.g., a database of Union warships in the Civil War
or the passengers on the Mayflower. The databases for which
the "perpetual protection" problem arises are ones that
have value over many years and require substantial, but not total,
revision. An example would be a historical database of the batting
statistics of all baseball players in the major leagues or a database
of medical compounds. Our understanding of the "perpetual
protection" problem with these databases is as follows.
In the classic case of a copyrighted book,
the text loses protection at the end of its term, although new,
revised versions of the text may enjoy fresh periods of protection.
This means that one can find unprotected texts of Antigone
or Pride and Prejudice in libraries all over the country.
At the same time, new versions of these books can be under some
copyright protection (including new introductions, translations,
"notes," artwork, etc.) It is possible to compare the
two versions -- old, unprotected and new, protected -- side-by-side.
In the digital, on-line environment, content
producers may choose not to alienate copies of their works; instead
access to a database may be licensed to users. The advantage is
that the database user can receive the most current version of
the compilation. The disadvantage is that the user may lack access
to any old version of the database against which to compare old and
new entries.
Imagine that in 2000, a database producer
makes a database; we will designate the first twelve entries alphabetically:
A
B
C
D
E
F
G
H
I
J
K
L
In 2003, it "expands and refreshes" the database, so that the first fifteen entries are as follows:
A
B
BB
C
D
E
F
FF
G
H
I
J
K
KK
L
In theory, under H.R. 2652 in the year 2016,
all of the entries except BB, FF, and KK lose protection -- and
can be copied in their entirety. The problem is that if the database
is provided via on-line services, there may be no means for the
user to know which entries are unprotected because they were original
entries and which entries are protected because they are the result
of maintenance investment within the past 15 years.
Critics of database protection are correct
to point out that this could produce "chilling" effects
on those who want to use the database after the initial term of
protection. One commentator has suggested that new entries be
electronically "tagged," so that a user can readily
determine what is protected and what is not, e.g.:
A
B
BB
C D E F
FF
G H I J K
KK
L
To the extent they have considered this
idea, the protection advocates have not been favorable to the
"tagging" idea. We too recognize that it might create
substantial technological problems or costs, depending on the
database.
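The term arithmetic in the example above, and the kind of per-entry
tag the commentator proposes, can be illustrated with a minimal sketch.
It is our own illustration and not drawn from the bill: the Entry
record, the investment_year field, and the whole-year treatment of the
15-year term are simplifying assumptions, and the bill's actual rule on
when the term starts and ends may differ in detail.

```python
from dataclasses import dataclass

# Assumption: the 15-year term is applied at whole-year granularity,
# so an entry stays protected through the 15th year after investment.
TERM_YEARS = 15


@dataclass(frozen=True)
class Entry:
    label: str
    investment_year: int  # hypothetical "tag": year of the qualifying investment


def is_protected(entry: Entry, as_of_year: int) -> bool:
    """True if no more than TERM_YEARS have passed since the investment."""
    return as_of_year - entry.investment_year <= TERM_YEARS


def build_example() -> list[Entry]:
    """Entries A through L made in 2000; BB, FF and KK added in 2003."""
    original = [Entry(letter, 2000) for letter in "ABCDEFGHIJKL"]
    added = [Entry(label, 2003) for label in ("BB", "FF", "KK")]
    return sorted(original + added, key=lambda e: e.label)


def protected_labels(entries: list[Entry], as_of_year: int) -> list[str]:
    """Labels of the entries still protected in the given year."""
    return [e.label for e in entries if is_protected(e, as_of_year)]


if __name__ == "__main__":
    print(protected_labels(build_example(), 2016))
```

With a tag of this kind attached to each entry, a user in 2016 could
separate the three entries added in 2003 from the twelve original
entries without needing access to the 2000 version of the database.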
Another possible solution would be to require
any database producer that wanted to enjoy protection for a revision
of their database after the fifteen year period to make (or have
made) the original, no-longer-protected database available in
a reasonable format. This would be the electronic equivalent of
the old copy of Wuthering Heights in the public library.
The original database need not be as available as the new
version -- just as old library books usually are not as available
as books at retail stores -- but it should reach some standard of
public access. On this count, it is possible that the problem
of "perpetual protection" could be addressed by establishing
a limited, well-defined archiving right for libraries, possibly
taking ideas from 17 U.S.C. §108 and §403 of the Digital
Millennium Copyright Act, which modifies 17 U.S.C. §108 to
cover digitized archiving.
This archiving approach does not, however,
resolve the similar problem that could arise when (a) a private
entity adds value to government-generated information, (b) it distributes
the new, value-added compilation, and (c) the government withdraws
from supplying the data to the public. In such situations, there
is the possibility that the private entity will use a minimal
amount of value-added processing to claim that the entire compilation
of information is protected. This could frustrate the goal of
making government-generated data widely available; at the same
time, we do not want to adopt any regime which will take away
from incentives to "value-add" to government-generated
data. Copyright law has addressed a parallel problem: mixtures
of privately-generated (copyrightable) materials with government-created
(noncopyrightable) materials. In such cases, 17 U.S.C. §
403 provides that where a work is "predominantly" U.S.
Government material, the copyright notice should include a "statement
identifying, either affirmatively or negatively, those portions"
protected under the copyright law as contrasted with the "works
of the United States Government". If a copyright holder fails
to include such a statement, 17 U.S.C. § 403 provides that
the defendant in an infringement action can claim a defense based
on innocent infringement to mitigate any damages. We think that
it would be appropriate to consider whether a similar provision,
possibly linked to "tagging" or otherwise identifying
government-generated data, should be included in H.R. 2652.