Editor's Note: This article is the third of a three-part series on
the use of animal models in toxicity testing. The first part explored traditional
uses of animals in toxicity testing; the second part discussed the use of
alternatives to animal models in human risk assessment.
Controversy continues to be a part of toxicity testing in the United
States and elsewhere. The usefulness of animal results for predicting human
risks and whether animal tests should be used at all are just two of the
contentious issues that have pitted scientists, industry representatives, government
regulators, and environmentalists against each other.
The economic, social, and political stakes are enormous. Industry spends
hundreds of billions of dollars to comply with regulations for minimizing
risks to workers and the environment, for health care for those affected
by exposure to toxic materials, and for litigation costs. Lawsuits over
dioxin exposure alone (including the infamous Agent Orange lawsuits from
the Vietnam War) are still pending, with a potential price tag of $500 billion
for eventual settlements. Many chemicals released into the environment are
not tested at all. Therefore, it is imperative to determine the most accurate,
efficient, and cost-effective ways of assessing risks to humans.
Given the importance of toxicity testing and regulation and the furor
surrounding these activities, the responsibility of the scientific community
to help shape reasonable solutions becomes paramount. To do this, science
must offer solid data or the best approximations of risks of environmental
exposures to inform other arenas (e.g., political, social, economic) that
determine the direction of social policy. Science must seek to answer the
questions of whether a substance is toxic; if so, what are the mechanisms
of toxicity; how much exposure makes it toxic; and what are the best methods
for estimating risk.
One indisputable fact is that the whole-animal model is still the most
widely used method for assessing potential toxicity risks to humans. This
is true whether the testing is designed to measure the carcinogenicity of
an environmental hazard such as pesticides or the potentially toxic effects
of a new drug used to treat a human disease. Alternatives to animals are
being developed, but animals still play the main role as research subjects
in human risk assessment [see Part 2 of this series in EHP 101(4)].
The efforts of the scientific community to defend the use of animal models
are undergirded by the belief that there are many similarities between species,
which supports the use of animal tests to predict effects in humans. For
example, in 1975 Schmidt-Nielsen published in the Journal of Experimental
Zoology a summary of the biological and biochemical similarities between
species, including cell structure, energy metabolism, and transmission of
genetic information. These similarities appear despite the more obvious
dissimilarities in size, weight, and other factors between, for example,
a laboratory mouse and a human. The principle underlying the use of animal
tests for human risk assessment is often referred to as the principle of
phylogenetic continuity.
Animal to Human Extrapolations
Nonetheless, animal-to-human extrapolation has its problems. There are
a variety of difficulties faced by regulatory agencies that attempt to use
results of animal studies to predict human risks.
Gerhard Winneke and Hellmuth Lilienthal at the Heinrich-Heine University
in Düsseldorf, Germany, have pointed to differences in neurotoxicity testing
between rodents and primates. John Caldwell at St. Mary's Hospital Medical
School in London has stated that interspecies differences in metabolism
may be the single most complicating factor in using animal toxicity data
to identify human hazards. In 1992, Caldwell evaluated earlier research
on the liver toxicity of the food flavoring agent cinnamyl anthranilate,
which had been used since the mid-1930s, and was identified in 1980 as a
potent liver carcinogen in mice. More detailed subsequent testing showed
significant differences in carcinogenic effects between mice and rats, however,
and even more differences in metabolism when volunteer human subjects were
used. Eventually, a ban against cinnamyl anthranilate use was lifted because
the animal-to-human interpretation was found to be unreliable.
Caldwell holds hope for the increasing use of "designer" animal
species wherein gene transfer could produce laboratory animals bred for
specific physiological and genetic traits [see "The Use of Transgenic
Mice for Environmental Health Research," EHP 101(4)]. Although
he acknowledges that toxicology poses special problems for the use of genetically
modified animals, such as the complexity of metabolic pathways of toxic
chemicals, he believes the approach may lead to animal models with far more
predictive power for human risk assessment than we have today.
Despite such reports, scientists in various fields have found more advantages
than disadvantages in using animal data to predict risks to humans. For
example, Judi Weissinger, formerly at FDA's Center for Drug Evaluation and
Research, and now at Glaxo, examined numerous research studies involving
animal-to-human safety evaluation interpretations for drugs. Weissinger
identified several well-known drugs for which cross-species results are
available. Animal data were considered in formulating the dose and experimental
design for the human trials for all of these drugs. In most cases, the animal
findings predicted effects in humans.
Judi Weissinger--Researchers
must justify selection of species to estimate risks. NIEHS
Some effects observed with animals in the studies reviewed by Weissinger
occurred in exposed humans even when toxicological tests with a small number
of human subjects did not detect these effects. For example, in the case
of methotrexate, used to treat bone cancer and rheumatoid arthritis, animal
tests showed liver toxicity effects. Direct tests of human liver function
failed to detect toxic effects, which were shown later from liver biopsies.
In contrast, however, a test of another unapproved drug gave completely
different results in animals and humans.
Weissinger's main point is that it will always be up to the researcher
to take into account toxicodynamic factors to justify selection of the animal
species and the validity of the animal-to-human prediction. Although her
findings concern drugs rather than environmental toxins, there are some
similarities in the approaches taken by FDA and those taken by EPA. Weissinger
notes, "FDA's Center for Food Safety and Applied Nutrition is one of
the FDA centers that uses a quantitative risk assessment similar to that
of EPA."
Maximum Tolerated Dose
It is not economically feasible to test chemicals at low doses because
enormous numbers of animals would be required to detect statistically significant
increases in cancer incidence. Therefore, it is essential that higher doses
be used to determine whether a chemical is a carcinogen. However, the use
of high doses to assess human risks has frequently been debated. Usually,
the experiments are done in rats and mice at the highest dose that will
not shorten the life of the animals but that is expected to cause them to
gain 10% less weight than control animals. This dose is known as the MTD
(maximum tolerated dose), and the studies frequently use the MTD and 1/2
MTD. To estimate human risk at exposures to the chemical that are frequently
thousands of times lower than the MTD, a linear extrapolation is used. The
MTD is used to give the greatest chance of observing an effect of the chemical
in animals. Linear extrapolation is used to obtain a hypothetical, maximum
risk to humans at various exposure levels, including especially low exposure
levels, as a basis for regulatory policy.
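To make the arithmetic concrete, here is a minimal sketch in Python; all doses and risk figures below are invented for illustration and are not taken from any actual study.

```python
# A minimal sketch of linear low-dose extrapolation from an MTD bioassay.
# All values are hypothetical, chosen only to illustrate the arithmetic.

mtd = 100.0                # mg/kg/day: hypothetical maximum tolerated dose
excess_risk_at_mtd = 0.30  # excess tumor incidence observed at the MTD

# Linear model: risk is assumed proportional to dose, all the way to zero.
slope = excess_risk_at_mtd / mtd   # risk per mg/kg/day

human_exposure = 0.01      # mg/kg/day: thousands of times below the MTD
estimated_risk = slope * human_exposure

print(f"Upper-bound lifetime excess risk: {estimated_risk:.0e}")
# 3e-05, i.e., roughly 3 excess cancers per 100,000 exposed people
```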
However, "testing for carcinogenicity in animals at near toxic doses
does not give enough information to predict the excess number of cancers
from low doses typically experienced by humans," said Lois Gold, director
of the Carcinogenic Potency Database Project at Lawrence Berkeley Laboratory.
Gold has serious reservations about the effects at the MTD because high
dosing may increase the number of tumors. "Near-toxic doses can frequently
cause cell division due to cell killing and consequent cell replacement,
and these effects can be unique to high doses. Cell division itself can
increase the chance of mutations and tumors, and then the effects at low
doses are likely to be much less than a linear model would predict and may
often be zero. It is important to add measurements of cell division to animal
cancer tests and to use the results to estimate low-dose risks more
adequately," said Gold.
Lois Gold--Use
of MTD not sufficient to predict low-dose risks. Jane Scherr
In response to criticism of the continued use of the MTD in long-term
carcinogenesis bioassays, it has been pointed out that the MTD is often
misunderstood and more complex than is realized. James Huff of the Environmental
Carcinogenesis Program at NIEHS stresses that the definition of the MTD
concept written in 1976 by the National Cancer Institute is still as good
and relevant today as it was then. Yet, most people seem to single out the
"body weight reductions" aspect and rarely explore the many other
scientific facets of the working definition. Huff insists that many additional
factors, such as weight gain or loss, clinical and biochemical signs of
toxicity, organ weights and function, absorption, distribution, and excretion
characteristics, go into the selection of the MTD. This prospective estimate
should be viewed as a guide rather than as an absolute.
James Huff--We
should continue to use MTD to minimize risks to humans. NIEHS
"Still, to minimize possible risks to humans and specifically to
identify potential carcinogenic risks for humans," Huff said, "we
should continue to use the MTD concept for evaluating the carcinogenic potential
of chemicals in laboratory animals." Doubts about the perfection of
the MTD should not be cause to change these dose-response studies or
to alter the experimental identification and measurement of potential human
risks using animal models. Instead, such concerns should be taken into account
when applying the results during the preliminary stages of the risk assessment
process while also using all other available data on the chemical.
This was also the recommendation of the National Research Council of
the National Academy of Sciences, which advocated continued use of the MTD
in the overall strategy for toxicity testing in a report released in February.
In a split decision, a majority of the council recommended that metabolic
and physiologic studies be conducted when initial test results at the MTD
warrant further study. A third of the panel disagreed, arguing that such
studies should be conducted first and then dose regimens established on
the basis of the results. It appears that the MTD approach of testing chemicals
in animals will continue to be an integral part of predicting potential
cancer risks to humans.
Dose-Response Relationships
Perhaps most controversial is the basic question of the methods used
to estimate human risks from animal testing data. This is particularly true
regarding the hazards posed to humans by chemicals that might cause cancer,
whether exposure occurs in the general environment, the workplace, or the
home. Two different methods, with quite different results for predicting
risks for humans, are the most widely used.
The first method, referred to as the "safety factor approach,"
has been in use in the United States since the 1930s. At that time, when
pesticides were beginning to be viewed as possibly dangerous for humans
as contaminants in the food supply, a method was needed to establish the
maximum amount humans could be exposed to without risk. Setting a toxicity
threshold for the amount of the dangerous substance was difficult, given
the size of the human population and various sensitivities of different
groups to the same substance. So, how could one extrapolate test data from
a few animals to all those humans?
A solution proposed in the 1950s by Arnold Lehman of the FDA was to
use a 100-fold safety factor. Lehman proposed first determining a threshold
dose level, beyond which a substance was toxic for animals. Then he divided
that amount of the substance by 10 because of the assumption that some people
might be more sensitive to the substance than animals. Next, he divided
the smaller level by 10 again, since some people might be more sensitive
than other people. The result was a 100-fold lower amount of the substance
as an acceptable exposure level for people.
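In code, Lehman's rule amounts to two successive divisions by 10. A minimal sketch, assuming a hypothetical animal threshold:

```python
# A minimal sketch of the 100-fold safety factor approach.
# The threshold value is hypothetical, not drawn from any actual study.

animal_threshold = 50.0   # mg/kg/day: highest dose with no observed toxicity

interspecies_factor = 10  # humans may be more sensitive than the test animals
intraspecies_factor = 10  # some people may be more sensitive than others

acceptable_exposure = animal_threshold / (interspecies_factor * intraspecies_factor)
print(acceptable_exposure)  # 0.5 mg/kg/day: a 100-fold lower exposure level
```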
Also in the 1950s, Wilhelm Hueper proposed an alternative way of estimating
risks to humans from carcinogens. This alternative approach generally is
referred to as the linear model, in which the risk to humans is estimated
from the dose-response relationship in animals. The risk is viewed as proportional
to what was observed in animals given high doses of the substance. Hueper
postulated that carcinogens can damage DNA leading to mutations in cells
representing an early but necessary step in the development of cancer. According
to the generally accepted theory, these genetically altered cells would
grow more rapidly than normal cells, eventually producing a tumor many years
after the chemical exposure had occurred. Since a single mutation could
theoretically cause a tumor, there would be no safe level of a carcinogen.
Support for this model has been provided by evaluation of cancer risk in
people exposed to varying levels of radiation and some chemicals that cause
mutations in cells. There is little disagreement among scientists and regulatory
officials that risks for those chemicals that cause cancer by increasing
the mutation frequency of critical target genes should be estimated using
linear models. However, some feel that those chemicals which are carcinogens
because they increase the expansion of genetically altered cells or induce
mutant cells by indirect mechanisms should be regulated by a safety factor
approach. Still others argue that the growth rate of genetically altered
cells can be enhanced by many different mechanisms and in some of those
cases, receptor-mediated increases in cell division for example, a linear
model might not be most appropriate. In addition, said Carl Barrett, chief
of NIEHS's Laboratory of Molecular Carcinogenesis, "chemicals acting
by mutagenic mechanisms can exhibit nonlinear dose responses. Therefore, the
dose response of each chemical must be determined, and it is not possible
to assume nonlinear responses on the basis of putative mechanisms, for example,
genotoxic or nongenotoxic."
The safety factor approach, where the toxic dose in animals is divided
by at least a factor of 100 to estimate human risk levels, is favored by
some and strongly opposed by others. "I don't really see anything happening
to resolve the differences between these two models," said Joseph Rodricks,
director of health sciences at the consulting firm Environ. "We shouldn't
be looking at ways to estimate risks that depend on whether the substance
is a carcinogen or a noncarcinogen anyway. Instead, there is evidence that
suggests there may be carcinogens that have thresholds and those that do
not. We should be looking at the underlying biological mechanisms to determine
the most appropriate models for assessing human risks."
Rodricks added that since there is no direct way to measure the exact
threshold for human risks (especially at the low doses often experienced
by humans), agencies like EPA, FDA, and OSHA have taken the cautious approach
(using the more conservative linear models) to estimating levels of a substance
that might be toxic to humans, "until someone can prove differently."
"The safety factor approach is subjective and does not use all of
the available information when establishing a safety level," said Christopher
Portier of NIEHS. "This approach fails to consider the observed pattern
of responses, whereas the evaluation of dose-response relationships permits
incorporation of biological data."
Portier highlighted the research design-dependency problem of the studies
that are used in the safety factor approach. Because the number of animals
used in such studies is often small (e.g., 10-50 per group), simply increasing
the sample size will increase the chance of finding statistical significance
between groups. For example, if a specific dose is given to a group of 10
animals, 3 may develop a tumor, and 1 of the 10 control animals may also
develop a tumor. A statistical test comparing the two groups might not find
the 1-in-10 versus 3-in-10 result to be statistically significant. If the
sample sizes are increased to 100, and the same proportions hold true, 30
of the experimental animals will develop a tumor versus 10 in the control
group. That difference in numbers would likely be statistically significant.
The point of Portier's argument is not that all studies should use more
animals, but that the safety factor approach may be flawed. Just changing
the sample size could yield a completely different result, although the
underlying slope of the response curve for the substance would be the same.
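Portier's example can be checked directly. The sketch below runs both versions of the comparison through Fisher's exact test, a standard choice for small two-by-two tumor-count tables; the choice of test is ours, while the counts come from the text.

```python
# Checking the sample-size example from the text with Fisher's exact test.
from scipy.stats import fisher_exact

# Each table row is [tumors, no tumors]: dosed group first, controls second.
_, p_small = fisher_exact([[3, 7], [1, 9]])      # 3/10 vs 1/10
_, p_large = fisher_exact([[30, 70], [10, 90]])  # 30/100 vs 10/100

print(f"n=10 per group:  p = {p_small:.2f}")   # ~0.58: not significant
print(f"n=100 per group: p = {p_large:.4f}")   # well below 0.05: significant
```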
The implications of which model to use are significant. For example,
the government-regulated safety exposure level of dioxin varies greatly
from country to country. In fact, some countries allow more than 1000 times
the acceptable U.S. levels, even though the same data (liver tumors in female
rats) were used worldwide to estimate risks. Differences are caused by the
selection of the models to predict the risks.
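A short worked comparison, using entirely made-up numbers, shows how the two approaches can diverge on identical animal data.

```python
# One hypothetical bioassay result, two models, very different answers.
observed_dose, observed_risk = 1.0, 0.10  # ug/kg/day giving 10% excess risk

# Linear model: the dose allowed for a one-in-a-million lifetime risk.
linear_level = observed_dose * 1e-6 / observed_risk   # 1e-05 ug/kg/day

# Safety factor model: a hypothetical no-effect dose divided by 100.
threshold_level = 0.1 / 100                           # 1e-03 ug/kg/day

print(f"The 'acceptable' levels differ {threshold_level / linear_level:.0f}-fold")
```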
Dose-response modeling is slowly becoming more common, and an effort
is underway at both NIEHS and EPA to evaluate past studies to more carefully
define dose-response relationships. It is possible that this collaborative
agency effort may begin to shed light on the controversy of the safety factor
versus linear dose-response models.
Biomarkers and uncertainty in risk assessment.
Experimental data obtained at high doses must be extrapolated to predict
effects at low doses, which can produce uncertainty. Biomarkers that can
often be quantified in the low dose region are useful in estimating the
shape of the dose-response curve. Joseph Tart
Molecular Toxicology and the Future
The emerging field of molecular toxicology is cited by several leading
scientists in the field as holding promise for the future of human risk
assessment. Using molecular biology methods to evaluate markers of toxic
response may offer a more accurate way of assessing human risks.
Molecular toxicology has also been useful in animal-to-human risk extrapolations.
For example, laboratory mice used by NTP researchers for testing substances
that might cause liver tumors in humans have a background rate of liver
tumors of about 20-30%. When using data from such animals, researchers must
be able to separate out these "normal" tumors from ones that might
be caused by the substance being tested. If liver tumors increase in these
mice during a test, it is not immediately clear whether the substance simply
increased their already heightened susceptibility to liver tumors or actually
introduced a new risk.
By sequencing the ras oncogene in the tumors of such mice, NIEHS
researchers detected pattern differences between experimental and control
animals. That is, some liver tumors in animals receiving the same chemical
show the same pattern in the oncogene as do control animals, but other tumors
show patterns unique to the chemically exposed animals. Therefore, the unique
mutation pattern in the tumors strongly suggests that the test substance
did more than just stimulate an existing proclivity toward tumor growth.
Seldom is one such test conclusive, but molecular toxicology is a powerful
tool for estimating risks for humans, whether in direct human studies or
in animals.
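The logic of such a comparison can be expressed compactly. In the sketch below the codon-change labels are invented for illustration; the point is simply to flag mutation patterns seen only in tumors from exposed animals.

```python
# A toy version of the mutation-pattern comparison described above.
# The codon-change labels are made up; real analyses use sequenced ras alleles.
from collections import Counter

control_patterns = {"CAA->AAA", "CAA->CTA"}  # patterns in spontaneous tumors
treated_patterns = ["CAA->AAA", "GGT->GTT", "GGT->GTT", "GGT->TGT"]

# Patterns absent from control tumors point to chemically induced mutations
# rather than mere promotion of a preexisting susceptibility.
unique = Counter(m for m in treated_patterns if m not in control_patterns)
print(unique)  # Counter({'GGT->GTT': 2, 'GGT->TGT': 1})
```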
"It is extremely important to pursue recent advances in molecular
biology to enhance our understanding of comparative toxicology," said
William Farland of EPA. "This will help us make use of the animal tools
that are available for evaluating human risks. EPA is interested in getting
the most out of animal tests to predict human risks. We cannot just look
at toxicology endpoints anymore," he added.
According to Barrett, "Studies of molecular carcinogenesis support
the concept that carcinogenesis is a multistage, multicausal, and multimechanistic
process. Attempts to develop simple models for this complex process will
probably fail. If anyone pronounces that they know the mechanism of a carcinogenic
agent, my guess is that they will be proven wrong, because most, if not
all, carcinogens will operate through multiple mechanisms. The dose-response
curve for the chemical will be difficult to predict because the rate-directing
mechanism may be more important than the most obvious mechanism."
The use of molecular toxicology in risk assessment has been addressed
by the prestigious International Agency for Research on Cancer (IARC), based
in Lyon, France. In 1991, IARC assembled a working group composed of leading
cancer researchers from a variety of disciplines. This working group came
to the following consensus:
When the available data on mechanisms are thought to be relevant to evaluation
of the carcinogenic risk of an agent to humans, they should be used in making
the overall evaluation, together with the combined evidence for animal and/or
human carcinogenicity. It is not possible to elaborate definitive guidelines
for all possible situations in which mechanistic data may influence evaluation
of carcinogens. The following scenarios are illustrative of the range of
options available. First, information concerning mechanisms of action may
confirm a particular level of carcinogen classification as indicated on
the basis of epidemiological and/or animal carcinogenicity data. Second,
for a particular agent, strong evidence for a mechanism of action that is
relevant to carcinogenicity in humans could justify 'upgrading' its overall
evaluation. Third, an overall evaluation of human cancer hazard on the basis
of animal carcinogenicity data could be downgraded by strong evidence that
the mechanism responsible for tumor growth in experimental animals is not
relevant to humans. In keeping with the goal of public health, priority
must be given to the demonstration that the mechanism is irrelevant to humans.
According to James Swenberg of the Laboratory of Molecular Carcinogenesis
and Mutagenesis at the University of North Carolina at Chapel Hill, it is
becoming clear that the scientific and regulatory communities are becoming
more comfortable with using molecular data in the risk assessment process.
Draft guidelines on cancer risk assessment being developed by the EPA place
renewed emphasis on incorporating mechanistic information into the risk
assessment process. Under these proposed guidelines, straight mathematical
extrapolation of risk is relegated to a default position to be used in the
absence of mechanistic data. Examples of potential uses of mechanistic data
include understanding the roles of metabolic activation, detoxification,
DNA repair, cell proliferation, and receptor-mediated responses in high
to low dose, route to route, and species to species extrapolation, as well
as genetic predisposition of animals and individuals to these and related
processes relevant to carcinogenesis. The extent of uncertainty associated
with straight mathematical risk extrapolation should clearly be reduced
by turning to such biologically based methods for predicting carcinogenic
risks.
Development of biologically based risk assessment
models. Data from multiple sources are used to construct models. Knowledge
gaps are identified, which when filled lead to refined models with less
uncertainty. Joseph Tart
Social Science and Environmental Risks
The increasing complexity of assessing risks in humans, including recent
developments in genetic research and molecular epidemiology, requires more
interdisciplinary efforts than ever before. The push to combine expertise
from multiple fields has been advocated by such leading figures in environmental
studies as Roger McClellan, president of the Chemical Industry Institute
of Toxicology.
McClellan believes that most of the easy problems have already been solved.
The difficult remaining problems will require teams of researchers from
various disciplines, but biological scientists often find that partnership
difficult to achieve (see EHP, 101:26-29).
Whatever challenges biological scientists may face in forging such teams
(including the problem posed by funding sources that restrict the activities
they will support), scientists need not feel alone in their dilemma. Their
research colleagues in the social sciences face similar challenges as they
study other parts of the "human equation" in environmental risk
assessment.
Psychologists in particular have been studying perceptions of risk by
humans, their reactions to such perceptions, and factors that influence
human behavior after exposure (or suspected exposure) to carcinogens and
toxic materials in the environment. While some of this interest on the part
of U.S. social scientists may have its roots in the burgeoning environmental
awareness and activism of the 1960s, it has more recently focused on health
risks posed by the same chemicals or materials being studied by biologists,
chemists, and others in the biomedical sciences.
The fact that exposure to toxic agents is not randomly or uniformly distributed
in the United States but is instead often concentrated in minority and low-income
communities means that cultural, sociological, and psychological factors
may determine how laboratory results are used in society.
The social science literature has seen an increase in research findings
related to human risk assessment that is at least analogous to the increase
seen in the biomedical literature. For example, in the June 1993 American
Psychologist, E. Vaughan described how different groups of human subjects
have adapted differently to environmental hazards. ("Individual and
Cultural Differences in Adaptation to Environmental Risks," pp. 673-680).
She also argued for the use of such data by public policy makers in forming
or changing regulations that affect environmental controls, thus implicitly
arguing for cross-discipline collaboration.
In the same vein, A.H. Wandersman and W.K. Hallman examined the pitfalls
of focusing exclusively on quantitative risk assessments when the "human"
dimension clearly shows more vagaries than the controlled lab ("Are
People Acting Irrationally?" American Psychologist, June 1993,
pp. 681-686). In attempting to measure fear of cancer and resultant human
behaviors, the authors approached the question of "how much toxicity is too
much?" from a totally different perspective than, for example, a biologist
or chemist might have used in a lab.
In other words, no matter what acceptable level of exposure to a substance
might be established by regulation, based on laboratory studies with various
dose levels, any level may be too much for the individual who learns of
it. Therefore, the decision of what substances must be studied, how the
related research will be funded, who will support such funding, and what
will be done with research findings may be driven by human fears and needs
as much as by scientific curiosity and method.
If so, the call for interdisciplinary studies made by McClellan and others
may well include the social and behavioral sciences as well as the biomedical
sciences. If such a goal becomes important enough, the academic, research,
and industry segments of society may engage in a complex partnership never
seen before. Given the increasing scrutiny being leveled at many health
science research budgets, it may be that such a partnership will not be
a luxury, but an indispensable mechanism for public support.
The role of animals in predicting human risks probably will continue
to be vital. Although the use of animals as research subjects has its limitations,
no completely suitable substitute has been found. "The results from
research using animals must be interpreted cautiously and in light of any
other data on the compound," said I. Bernard Weinstein, director of
the Columbia-Presbyterian Cancer Center in New York. "But, despite
the criticism, we should not abandon long-term rodent bioassays used as
a screen for potential carcinogens in humans. Criticisms of this approach
are overdone, and the National Toxicology Program, which does most of the
rodent bioassays, interprets findings cautiously and is aware of the pitfalls."
I. Bernard Weinstein--We should not
abandon long-term rodent bioassays. Columbia University
Weinstein pointed to the most important consideration in the animal-to-human
leap: does the approach help humans? "Using the animal model has been
useful in predicting certain human carcinogens in the past. For example,
vinyl chloride and diethylstilbestrol were identified as carcinogens in
rodents years before they were finally found to be carcinogens for humans."
This approach is not only relevant for the past, but perhaps for the
future as well. "There is a controversy now regarding certain pesticides
and herbicides. They have already been proven positive as carcinogens for
rodents. This has alerted epidemiologists that they might be carcinogenic
for humans as well, and recent epidemiologic studies indicate that this
may be the case," Weinstein said. Since, as he noted, "almost all known human
carcinogens are also carcinogenic for rodents," his warning suggests that
abandoning the animal model now may be tantamount to throwing out the baby
with the bathwater.
Teamwork and a Continuum of Risk and Prevention
In addition to the call for multiple data resources, many groups are
taking a stance for more cooperation between academia, industry, and the
government. Roger McClellan of the Chemical Industry Institute of Toxicology
highlighted the need for cooperative efforts by many different scientific
disciplines in toxicology testing by using the concept of a continuum of
risk. This continuum begins at the basic assessment level and extends up
to and includes prevention of risks to humans.
"If we look at environmental and occupational health issues, we
can look at a four-stage process," McClellan said, "covering source,
exposure, tissue dose, and response," each of which requires the input
and expertise of scientists and other professionals from many different
disciplines. The challenges posed by this continuum, McClellan noted, aside
from the sheer logistical problems in having many disciplines routinely
interact, include the problems caused by the different jargon used by the
professionals. Even when using the same words, different scientists don't
always mean the same things. "We need to find ways to get these different
disciplines working together," McClellan added.
The need for such collaboration is increasingly recognized in the government
sector as well. Speaking at CIIT's annual meeting in May, Michael Taylor,
deputy commissioner for policy, FDA, stressed the need for such cooperation.
FDA conducts collaborative research and establishes advisory committees
with representatives from outside the agency. Other major federal agencies
such as EPA have developed similar mechanisms to facilitate communication
between government, industry, and academia.
Such interaction between scientists, industry leaders, and government
policy makers can only help bridge the gap between results found from animal
toxicology tests and subsequent predictions of risks for humans. Leaping
that gap is not impossible, but it is difficult. Advances with cell and
tissue cultures, computer modeling, and genetic research continue to help
reduce the need for animals to test substances that can harm humanity, but
the advances probably will not totally eliminate that need. As long as we
value human life, that need will persist as scientists continue trying to
ensure that the contributions made by such animal research subjects lead
to practical benefits for a healthier society in a less risky environment.
Portions of this article were provided by Dennis M. Maloney, president
of The Deem Corporation.
Health Risk Assessment Research: The OTA Report
The 1950s, a time of placid prosperity, was also an era in which the
United States awoke to health threats in its environment. Out of that awakening
came a scientific discipline that now, decades later, determines not only
how much residual pesticide may safely be allowed in an orange, but also
attempts to define how much exposure to carcinogenic chemicals may eventually
lead to death from cancer.
Health risk assessment research is based on a multidisciplinary alliance
of physics, chemistry, biology, genetics, geology, pharmacology, pathology,
and statistics. This alliance, the basis for a whole new field of analysis,
grew out of the need for courts, industry, and government to respond to
the demands of the public to quantify the potential effects of toxic substances
and radiation on human health, as well as to find some way to judge acceptable
limits of exposure.
As such, the practice of health risk assessment is only about 20 years
old; yet, its methods and principles are widely used in policy decisions
that affect millions of lives and involve hundreds of billions of dollars.
How far the field has come, how useful it has been in regulating the substances
considered dangerous to health or the environment, and how helpful it has
been in identifying these substances are all assessed in a report from the
Office of Technology Assessment, which will be published in the fall.
In an early draft of its report, the OTA concludes that health risk assessment
research is necessary; however, major gaps in its practice and application
need to be closed in coming years. Most pressing is the need for more research
in environmental health to create, identify, or develop better methods to
be used in defining and assessing risk; this is referred to as methodological
research.
The second area in which there is room for improvement is policy making.
Health risk assessment is now conducted by many scientists in many agencies.
The administrators of these agencies must communicate across bureaucratic
boundaries to acquire or conduct relevant research and generate the best
risk assessments. Similarly, OTA says, government agencies now set their
own separate agendas, rarely working together to solve common problems;
turf battles are not unknown. Although different federal statutes govern
the kinds of risk assessments done by the agencies, therefore limiting some
of the administrators' control, the agencies must somehow end their isolation
and learn to work together.
Risk Assessment: You Have to Speak the Language
Risk--the probability of an adverse health effect as a result
of exposure to a hazardous substance.
Risk Assessment--the use of available information to evaluate
and estimate exposure to a substance and its consequent adverse health effects.
Risk assessment consists of hazard identification, exposure assessment,
dose-response assessment, and risk characterization.
Hazard Identification--the qualitative evaluation of the adverse
health effects of a substance(s) in animals or in humans.
Exposure Assessment--the evaluation of the types (routes and media),
magnitudes, time, and duration of actual or anticipated exposures and of
doses, when known, and, when appropriate, the number of persons who are
likely to be exposed.
Dose-Response Assessment--the process of estimating the relation
between the dose of a substance(s) and the incidence of an adverse health
effect.
Risk Characterization--the process of estimating the probable
incidence of an adverse health effect to humans under various conditions
of exposure, including a description of the uncertainties involved.
Risk Management--a regulatory decision that incorporates information
on benefits versus risks of exposure in certain situations.
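The definitions above compose naturally: hazard identification gates the analysis, exposure and dose-response assessment supply the numbers, and risk characterization combines them. A toy sketch, assuming a simple linear dose-response, makes the composition explicit (all values are illustrative).

```python
# A toy composition of the risk assessment steps defined above.
from dataclasses import dataclass

@dataclass
class RiskAssessment:
    hazard_identified: bool  # hazard identification
    exposure: float          # exposure assessment (mg/kg/day)
    slope: float             # dose-response assessment (risk per mg/kg/day)

    def characterize(self) -> float:
        """Risk characterization: probable incidence at the given exposure."""
        return self.slope * self.exposure if self.hazard_identified else 0.0

print(RiskAssessment(True, exposure=0.01, slope=0.003).characterize())  # ~3e-05
```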
Origins of Health Risk Assessment
Opinions differ on how the process of risk assessment began. But it is
agreed that the Delaney Amendment to the Food, Drug and Cosmetic Act, enacted
in 1958 after Congressman James Delaney's impassioned plea for control of
carcinogens in foods, provided the initial impetus. The New York Democrat's
wife had just died of cancer, and Delaney wanted to limit the entire population's
exposure; thus, the law that bears his name states that any substance shown
to be a carcinogen cannot be added to food.
Existing law at the time stated that food and drugs must be safe, and
the Food and Drug Administration regulated substances in foods at the level
of parts per million. As the science of analytical chemistry advanced in
the 1960s and 1970s and detection of chemicals at parts per billion or parts
per trillion levels became routine, the question became: if only one part
per quadrillion of a given carcinogen is present in a food substance, such
as food coloring, is that reason enough to ban the product? Under the Delaney
Amendment, as interpreted by the court, the answer would be yes, and the
food substance could not be sold for human consumption. But then, if such
a tiny proportion of carcinogen were present, how harmful could it be?
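The question of scale has a simple arithmetic answer: one part per quadrillion is a fraction of 10^-15, so the absolute quantities involved are vanishingly small. A quick sketch, with the serving size assumed purely for illustration:

```python
# What "one part per quadrillion" means in absolute terms.
serving_g = 100.0  # hypothetical serving size in grams
fraction = 1e-15   # one part per quadrillion (short scale)
print(f"{serving_g * fraction:.0e} g per serving")  # 1e-13 g: a tenth of a picogram
```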
To determine answers to these kinds of questions, scientists developed
new methods of predicting adverse effects from low levels of exposure. The
first paper on estimating risk from exposure to low doses of substances
based on tests in which animals were exposed to high doses was published
in 1961. As formal procedures for performing animal bioassays, originally
used in qualitative risk assessment, were standardized in the 1960s and
1970s, regulatory agencies began performing risk assessments regularly in
the 1970s.
In 1983, a National Research Council committee released Risk Assessment
in the Federal Government: Managing the Process, which defined the steps
in risk assessment and established a generally accepted nomenclature. Since
the late 1980s risk assessment has been branching out from its original
focus on carcinogens and now evaluates the effects on other systems, such
as reproductive toxicity and birth defects arising from exposure to toxins
in food and the environment.
On the 10-year anniversary of its first general publication in the field,
the National Research Council released a second volume, Issues in Risk
Assessment, evaluating two risk assessment practices, the use of the
maximum tolerated dose in animal assays and the two-stage model of carcinogenesis,
with an analysis of ecologic risk assessment. Reports are being prepared
on exposure assessment and developmental toxicity.
Impetus for the OTA Report
Two congressional requests are responsible for the OTA study. In 1991,
Congressmen John Dingell (D-Michigan), chair of the House Committee on Energy
and Commerce, and George Brown (D-California), chair of the House Committee
on Science, Space and Technology, asked OTA to prepare a report on government
health risk assessment research. Later, in 1992, joined by Robert S. Walker
(R-Pennsylvania) and J. Lewis (R-California), Brown requested an examination
of EPA's approach to reducing exposure to radon in buildings. "The
congressional requesters wanted the lay of the land," said Dalton G.
Paxman, analyst for the OTA Biological and Behavioral Sciences Program and
project director of the report.
A new decision-making paradigm. The OTA report
suggests a mechanism for linking health risk research to decision-making.
Office of Technology Assessment
In August 1991, as soon as the initial request was received, OTA began
assembling a panel of 15 experts from nongovernmental organizations such
as universities and those with a particular interest in the field, including
industry and environmental groups. OTA staff met three times with the panel
over the course of the report's preparation to identify issues and areas
of consensus and disagreement. OTA also met individually with agency representatives
who provided source materials and other contacts for the investigators.
The panel reviewed and revised two drafts of the report before its final
edition was agreed upon. In addition, the report was reviewed by more than
100 outside reviewers. OTA also convened 11 other experts, including those
from government, for a one-day workshop focusing on research.
OTA is known for its "report cards" on research activities
in and out of government; thus, federal agencies at first expected to be
graded individually on their health risk assessment research activities.
But as OTA interviewers contacted more and more scientists and administrators
within the agencies, another picture emerged.
What the investigators found was that health risk assessment was not
just an activity where certain agencies or departments were succeeding and
others were failing. There were, rather, problems common to the field throughout
government. "The scientists at the agencies felt there were problems,
but couldn't put their finger on them exactly," Paxman said. "So
now, health risk assessment research is looked at as an integrated federal
effort. It's at the integrated overall level where the problems exist, and
that's where the opportunities are as well."
Dalton Paxman--There
should be a closer link between research and decision-making.
Congressional offices had heard that government wasn't using the best
risk assessment science, but OTA found that the agencies "are doing
good science--but there's more that goes into regulation than science,"
Paxman said, "There is a link between research and decision-making,
and there should be a closer link. But there's a feeling that research would
be poisoned by being more closely linked with regulation."
Management has been separated from risk assessment research in government
because many believe that managers' expectations could skew the research
outcome. Such separation of management and research was supported in the
1983 NAS report. The OTA report states that this gap must now be narrowed
to unify risk assessment because, according to Sheila Jasanoff, professor
and chair of the Department of Science and Technology Studies at Cornell
University, risk assessment is "not a purely scientific activity."
"Indeed," Jasanoff wrote recently in the EPA Journal,
"risk assessment is often described as an art rather than a science.
This formulation emphasizes that risk assessment, like any artistic endeavor,
requires the use of subjective judgment."
Impact of Risk Assessment Research
The OTA draft report acknowledges that risk assessment forms the basis
of dozens of federal, state, and local laws and regulations, thus influencing
the expenditure of hundreds of billions of dollars in health and environmental
protection. "Policy makers depend on health risk assessment and research
when making regulatory decisions about which risks to tolerate and which
to reduce," the report says, pointing out that decisions to reduce
risks may lead to vast expenditures for clean-up, while decisions to tolerate
those risks may lead to vast health care and disability costs for those
exposed.
The projected cost of complying with EPA regulations in fiscal year 1993,
based on data obtained in 1987, is $146 billion to $154 billion. An estimate
for the cost of compliance with Food and Drug Administration regulations
is not available. However, major pharmaceutical manufacturers spent $9.2
billion in 1991 on research and development, some portion of which represents
toxicity and safety testing to satisfy FDA requirements. Compliance costs
are also incurred due to regulations from the Occupational Safety and Health
Administration and the Consumer Product Safety Commission. In 1992, Congress
appropriated more than $9 billion for environmental clean-up at federal
facilities, particularly those operated by the Department of Energy and
Department of Defense.
"The costs of some environmentally related illnesses are reasonably
estimated to reach well into the billions of dollars, although there are
no comprehensive estimates available," the OTA report notes. These
include lead poisoning, pollution-related acute respiratory conditions,
occupational diseases, and certain cancers. The portion of these costs that
falls on federal programs such as Medicare, Medicaid, the Veterans Administration,
and Social Security is estimated to be about 20% of the total. Yet, OTA
says, "private sector costs generally dwarf public expenditures."
The private-sector burden for environmentally related illness is about 70%,
the report says. Other costs such as human suffering are impossible to value
in dollars and cents.
OTA found that health risk assessment research does not typically fit
into the sequential four-step process (hazard identification, exposure assessment,
dose-response assessment, and risk characterization) outlined by the National
Research Council. Instead, OTA classified health risk assessment research
into three distinct categories: methodological research, or research to
improve the method of risk assessment; basic research that may provide information
to improve the foundations of risk assessment; and chemical-specific data
development.
OTA characterized methodological research as devising new approaches
for extrapolating results from animal models to human estimates of risk
and extrapolating from high exposure levels to low exposures, for developing
new assay systems, and for measuring uncertainty. This type of research
is generic in the sense that its results can have a large impact on many
assessments. Moreover, these models are directed at the most uncertain aspects
of risk assessments, especially extrapolations from high to low dose, from
animal models to humans, and for predicting toxicity of chemicals for which
little or no toxicity data exist.
The OTA report separates basic research into two types: basic health
risk research and basic sciences. Basic health risk research involves investigating
disease mechanisms associated with exposure to toxic agents and examining
the experimental tools for use in risk assessment research. Basic biological
and biomedical sciences investigate the structure and function of molecules,
cells, organs, physiological systems, and their relationship to the functioning
organism.
Chemical-specific Data Development
Chemical-specific data development includes the execution of any or all
of the steps in the National Research Council's paradigm. Hazard identification
represents the broadest and most diverse category of data development research
and involves testing agents relevant to the agencies' missions, as well
as industry testing of potential commercial chemicals and substances already
in use. The OTA report includes collection of data on exposure to environmental
agents in this type of research.
Although some scientists dismiss "data collection" as less
important than methodologic or basic research, OTA considers such work to
be essential. Attention to accuracy is critical because exposure data and
basic toxicologic information usually form the basis of agency rule-making.
Data collection provides essential input for both research into risk assessment
methods and basic research.
A look at the number of existing chemicals and new compounds added each
year explains the need for further toxicity testing and data development.
OTA estimated that the total number of chemicals in commerce including industrial
chemicals, pesticides, and food additives--all of which are potentially
subject to regulation in the United States--is about 62,000, with an additional
1,500 developed each year.
Setting Priorities
Charting the course of risk assessment research requires work at several
levels in federal agencies, and OTA examined the process at the national,
agency, and program levels. Most importantly, OTA found that risk assessment
research is not a national research priority. This is in spite of the fact
that regulatory agencies are setting priorities and levels for cleanup at
hazardous waste sites on the basis of risk assessment and that many national
goals, especially public health and environmental protection, benefit from
risk assessment research.
Regulatory stew.
The OTA report suggests ingredients that should go into regulatory
decision-making. Office of Technology Assessment
Some scientists interviewed for the report claim that the research system
does not work. Resources, they argue, are squandered on a system that is
incapable of setting priorities. Consequently, the perception exists that
the areas of highest priority research, i.e., those most likely to improve
the process of risk assessment, are not being funded or conducted.
OTA found that no process exists to set national and agency-wide research
priorities for risk assessment research; "in fact, on a national level,
no priority-setting mechanism appears to be in place for research generally,
let alone risk assessment research specifically." "In contrast,"
said the report, "priority-setting at the program level appears comparatively
formalized and well-directed in spite of limited discretionary budgets."
The federal research effort to improve risk assessment is largely decentralized,
and though OTA said it observed a few multiagency efforts, participants
and nonparticipants displayed little enthusiasm, and some even showed overt
hostility toward the effort. Federal scientists conduct research almost
entirely in support of their sponsoring agencies and departments, which
is also the case for environmental research in general. Risk assessment
research is spread across at least 12 different agencies and more than 28
programs. Each agency has its own set of priorities, based on different
constituents, legislative mandates, and missions and influenced by historical
factors. This makes agency-specific research easier, but it can also make
work fragmented and diffuse.
OTA's examination of the resources allocated for research on risk assessment
illuminates the breadth of the research. This breadth also contributes to
the difficulty in defining health risk assessment research because both
the accounting and activities are diffuse and usually overlap with other
efforts. OTA estimated the resources for the entire federal effort at between
$500 million and $600 million, with less than 20% being devoted to methodological
research.
Decision-making
OTA also analyzed the interplay between research and decision-making.
At its most basic level, the relationship between research and decision-making
can be seen as a feedback loop: half of the loop is the impact of research
on decision-making; the other half is the impact of decisions on research
that needs to be done. This relationship provides a panoply of options for
research priorities. In the current decision-making process, however, research
identifies potential dangers, the public conveys its concerns to Congress,
and Congress passes laws to address these concerns. According to the report,
"This results in a reactive mode that may limit the capacity of agencies,
such as EPA, to structure long-term efficient and effective solutions."
OTA admits that there are limits to the capacity of science to inform
even on technical issues. "Well beyond more and better science, solving
environmental risk assessment problems requires building trust between government,
industry and citizens; it requires leadership in setting realistic goals
and arranging collaborations of researchers from various disciplines and
sectors of society," says the report. The validity of health risk assessments
increases with the participation of scientists from many disciplines. Although
disciplines such as the physical, biological, toxicological, and biomedical
sciences provide the underpinning for assessing health risks, they alone
are not sufficient for assessing human risks. Assumptions and policy positions,
with embedded value judgments, are necessary to complete risk assessments.
The selection of these assumptions would benefit from input from other disciplines,
especially those that contribute to the development of laws and regulations.
Sheila Jasanoff argues for "bridging the two cultures of risk analysis,"
which include quantitative sciences and nonquantitative sciences that are
humanistic and culturally grounded, such as behavioral and political sciences.
Whatever is expected of it, risk assessment is only one element in formulating
regulatory actions. Legislative mandates, social values, technical feasibility,
economic factors, and the success or shortcomings of the research that feeds
into risk assessment may take a more prominent role than expert predictions
of risk. Scientific research can provide a more solid foundation for the
decision maker in choosing among alternatives in risk management, but research
alone will not necessarily steer decisions to control the most significant
risks.
Structuring the Future
"Methods for identifying toxicants, exposed individuals and populations,
models for inferring human health effects from animal studies, techniques
for estimating risks and predicting health effects with few data are all
in need of improvement or development," says the OTA report. To make
these improvements, collaboration within and between federal agencies is
needed. Such collaboration has taken place between government and universities,
but not often between government and industry.
The report provides options for legislative action. The following actions
were among those mentioned in the report:
-- Continue along the present path, as progress is being made at many
agencies toward at least some of the goals the report mentions.
-- Launch a national initiative with White House or executive-level
leadership, which would raise the field to a higher level and priority.
This would not only attract resources but promote interagency and extramural
cooperation.
-- Expand resources available for the field. Given the current fiscal
atmosphere, Congress may not increase appropriations for health risk assessment
research; therefore, funds could be raised by redirecting budget money
already appropriated for other areas. For instance, Congress appropriated
$9 billion during fiscal year 1993 to clean up hazardous waste sites at
the Department of Energy and the Department of Defense; of that, a small
percentage could be set aside for health risk assessment research to be applied
directly to the clean-up effort.
-- Institute a system of fees and penalties: users of research could
be assessed fees. A percentage of the money sent to the general fund from
fines levied by EPA or OSHA could be diverted into a health risk assessment
research fund.
-- Foster research in risk assessment methodology by establishing a
center for health risk assessment methodology research. Alternatively,
appoint a central coordinating body to provide leadership in conducting
research on risk assessment methodology.
-- Encourage technology transfer in developing the field of risk assessment
research by earmarking funds for academic centers on risk assessment research;
providing funds to the Department of Commerce to encourage transfer of
technology that has commercial applications; encouraging more industry
support of health risk assessment research; and setting aside funds
as incentives for collaboration.
Behind the Scenes
The staffers who contributed to the report tried to produce an objective
document, and indeed, the tone of the text is measured and its conclusions
balanced. Behind the document, however, lies a good deal of passion and opinion.
"I'm a devotee of risk assessment; I do it all the time," said
Frank Young, former FDA commissioner. Yet, he acknowledges, "there
is a need to analyze the various assumptions that go into the major procedures
used in health risk assessment." Young, now director of the Office
of Emergency Preparedness/National Defense Medical System, served in the
OTA workshop where the report took shape.
Frank Young--We
need to analyze risk assessment assumptions. FDA
One assumption of risk assessment involves laboratory animals such as
white mice, which provide a considerable amount of data for risk assessment.
The assumption holds that their diet makes no difference in the research
outcome. Young disagrees, pointing out that recent work has shown that what
laboratory animals eat can have a profound influence on results of risk
assessment experiments. This may call into question all previous animal
research.
The issue of how to extrapolate animal data to humans remains controversial.
One way to clarify that question would be to conduct experiments using substances
shown to cause cancer in humans. "No one has undertaken to understand
animal responses to known human carcinogens" to see how the responses
differ, said James D. Wilson, regulatory affairs director for Monsanto Company
and an advisory panel member. "We need to understand that in detail,
and we don't."
There is a need, according to John Vandenberg of the Health Effects Research
Laboratory of EPA, to develop better models for testing because "we
will never have human models" on which to experiment. Although there
has been enormous progress in risk assessment, "no individual should
put blind faith in the application of science and human reasoning. Risk
assessment is an approximation of truth," says Young.
Nevertheless, FDA must perform risk assessment, which includes safety
testing, to determine the safety and efficacy of drugs and devices. The
regulation and testing of food, though on the surface a simpler task, is
as complex as many drug issues. Although the Delaney Amendment prohibits
adding carcinogenic substances to foods and food substances, research has
since determined that many foods contain natural carcinogens, suggesting
that there are factors in diet and human metabolism that protect against
these substances. Other foods that contain sodium nitrites, such as sausages
and other cured meats, are part of certain cultural heritages; thus,
they remain on the market. "Some argue that if we were to ban carcinogens
altogether, it would be difficult to put together an adequate diet,"
said Lester Lave, a professor of economics at Carnegie Mellon University
who is a former president of the Society for Risk Analysis.
Taking a harder line, Ellen Silbergeld of the Environmental Defense Fund
calls health risk assessment as a way of dealing with hazards in the environment
"a waste of time." To Silbergeld, a substance or a situation is
either dangerous or it's not; there is no need to "blur the edges,"
as she claims risk assessment does. Industry, by demanding more precision,
can succeed in postponing regulation. Further, she scoffs at the debate
over whether results in animal experiments can be extrapolated to humans,
believing a substance harmful to one mammal is likely to be harmful to others,
period. "What we should be doing is looking more closely at the distribution
of death and disease, and then asking what portion can be prevented,"
she says, and then making the appropriate policy decisions to make prevention
possible. Vandenberg considers such arguments extreme, summing them up as,
"If it's bad in humans, don't allow any exposure, or at least limit
it. If you're going to limit it, what are you going to limit it to? That's
what risk assessment is."
Ellen Silbergeld--Health
risk assessment is a waste of time. Environmental Defense Fund
Erik Olson, senior attorney with the Natural Resources Defense Council,
says the organization accepts that risk assessment is a part of society,
but compares tinkering with risk assessment assumptions to "readjusting
the chairs on the Titanic." Olson said, "The way to go is not
to figure out whether 20 or 30 people are going to die of cancer from exposure
to a substance and try to manage or reduce that risk, but to prevent the
pollution in the first place."
Industry has argued, however, that decision makers must take into account
everything that's known about health risks, and in many cases "there's
more involved than a simplistic policy would admit," Wilson said. Anyone
who has ever dealt with government bureaucracy would agree that it is far
from simple and, as the OTA report mentions, has been a hindrance to good
risk assessment. Scientists contacted tended to agree that, as Vandenberg
put it, "within the federal government, you've got Balkanization of
policy." However, there may be a reason for that which goes beyond
protection of turf. "Agencies are all implementing different pieces
of legislation," commented Bryan Hardin of NIOSH. "Each agency
tends to have a different set of customers."
Some researchers, including Young and Wilson, point out that risk assessors
must keep up with the latest science, and in government, they do not always
do so. An example is formaldehyde. High concentrations of this substance
cause tumors in rats; it would seem likely, then, that low exposure to formaldehyde
among humans would cause enormous numbers of cancers. Yet that, says Wilson,
has not happened. Statistical adjustments were developed to make the rat model
match reality more accurately, but EPA scientists have been reluctant
to modify their standard practice, even though EPA has advised them to do
so whenever new information comes along that should be taken into consideration.
Whether the OTA report will result in reform is open to question, but
Congress is already taking action in some areas the document mentions. Senator
Daniel Patrick Moynihan is sponsoring a bill that would set up two advisory
committees whose duties would include ranking health and ecology risks,
as well as an interagency panel to make federal risk assessment more consistent.
It would also establish a research program to improve risk assessment methodology
(see Spheres of Influence).
The question remains whether this, or any other reforms, will simplify
risk assessment or make it more accurate. Some say that the more basic question
is whether risk assessments should be used at all, or that the answer is
simply to use fewer toxic chemicals. Regardless of the questions asked,
it seems clear that in an increasingly hazardous world, the need to inject
some measure of certainty into the outcomes of our actions will continue
to fuel the drive for risk assessments.
Jan Ziegler
Jan Ziegler is a freelance
writer in Washington, DC.