CFSAN/Office of Nutrition, Labeling and Dietary Supplements
This guidance document is being distributed for comment purposes only
Comments and suggestions regarding this draft document should be submitted within 60 days of publication in the Federal Register of the notice announcing the availability of the draft guidance. Submit comments to Division of Dockets Management (HFA-305), Food and Drug Administration, 5630 Fishers Lane, rm. 1061, Rockville, MD 20852. All comments should be identified with the docket number listed in the notice of availability that publishes in the Federal Register.
For questions regarding this draft document, contact the Center for Food Safety and Applied Nutrition (CFSAN) at 301-436-2367* (* Updated phone: 301-436-2373).
U.S. Department of Health and Human Services
Food and Drug Administration
Center for Food Safety and Applied Nutrition
Contains Nonbinding Recommendations
This draft guidance, when finalized, will represent the Food and Drug Administration's (FDA's) current thinking on this topic. It does not create or confer any rights for or on any person and does not operate to bind FDA or the public. You may use an alternative approach if the approach satisfies the requirements of the applicable statutes and regulations. If you want to discuss an alternative approach, contact the FDA staff responsible for implementing this guidance. If you cannot identify the appropriate FDA staff, call the appropriate number listed on the title page of this guidance.
This draft guidance document represents the agency's current thinking on 1) the process for evaluating the scientific evidence for a health claim, 2) the meaning of the significant scientific agreement (SSA) standard in section 403(r)(3) of the Federal Food, Drug, and Cosmetic Act (the Act) (21 U.S.C. 343(r)(3)) and 21 CFR 101.14(c), and 3) credible scientific evidence to support a qualified health claim.
This guidance document describes the evidence-based review system that FDA intends to use to evaluate the publicly available scientific evidence for SSA health claims or qualified health claims on the relationship between a substance and a disease or health-related condition.(2) This guidance document explains the agency's current thinking on the scientific review approach FDA should use and is intended to provide guidance to health claim petitioners.(3)
The specific topics addressed in this guidance document are: (1) identifying studies that evaluate the substance/disease relationship, (2) identifying surrogate endpoints for disease risk, (3) evaluating the human studies to determine whether scientific conclusions can be drawn from them about the substance/disease relationship, (4) assessing the methodological quality of each human study from which scientific conclusions about the substance/disease relationship can be drawn, and (5) evaluating the totality of scientific evidence.
FDA's guidance documents, including this guidance, do not establish legally enforceable responsibilities. Instead, guidances describe the agency's current thinking on a topic and should be viewed only as recommendations, unless specific regulatory or statutory requirements are cited. The use of the word should in the agency's guidances means that something is suggested or recommended, but not required.
The Nutrition Labeling and Education Act of 1990 (NLEA) (Pub. L. 101-535) was designed to give consumers more scientifically valid information about foods they eat. Among other provisions, the NLEA directed FDA to issue regulations providing for the use of statements that describe the relationship between a substance and a disease ("health claims") in the labeling of foods, including dietary supplements, after such statements have been reviewed and authorized by FDA.(4) For these health claims, that is, statements about substance/disease relationships, FDA has defined the term "substance" by regulation as a specific food or food component (21 CFR 101.14(a)(2)). An authorized health claim may be used on both conventional foods and dietary supplements, provided that the substance in the product and the product itself meet the appropriate standards in the authorizing regulation. Health claims are directed to the general population or designated subgroups (e.g., the elderly) and are intended to assist the consumer in maintaining healthful dietary practices.
In evaluating a petition for an authorized health claim, FDA considers whether the evidence supporting the relationship that is the subject of the claim meets the SSA standard. This standard derives from 21 U.S.C. 343(r)(3)(B)(i), which provides that FDA shall authorize a health claim to be used on conventional foods if the agency "determines based on the totality of the publicly available evidence (including evidence from well-designed studies conducted in a manner which is consistent with generally recognized scientific procedures and principles), that there is significant scientific agreement among experts qualified by scientific training and experience to evaluate such claims, that the claim is supported by such evidence." This scientific standard was prescribed by statute for conventional food health claims; by regulation, FDA adopted the same standard for dietary supplement health claims. See 21 CFR 101.14(c).
The genesis of qualified health claims was the court of appeals decision in Pearson v. Shalala (Pearson). In that case, the plaintiffs challenged FDA's decision not to authorize health claims for four specific substance/disease relationships in the labeling of dietary supplements. Although the district court ruled for FDA (14 F. Supp. 2d 10 (D.D.C. 1998)), the U.S. Court of Appeals for the D.C. Circuit reversed the lower court's decision (164 F.3d 650 (D.C. Cir. 1999)). The appeals court held that the First Amendment does not permit FDA to reject health claims that the agency determines to be potentially misleading unless the agency also reasonably determines that a disclaimer would not eliminate the potential deception. The appeals court also held that the Administrative Procedure Act (APA) required FDA to clarify the "significant scientific agreement" (SSA) standard for authorizing health claims.
On December 22, 1999, FDA announced the issuance of its Guidance for Industry: Significant Scientific Agreement in the Review of Health Claims for Conventional Foods and Dietary Supplements (64 Fed. Reg. 17494). This guidance document was issued to clarify FDA's interpretation of the SSA standard in response to the court of appeals' second holding in Pearson.
On December 20, 2002, the agency announced its intention to extend its approach to implementing the Pearson decision to include health claims for conventional foods (67 Fed. Reg. 78002). Recognizing the need for a scientific framework for qualified health claims, the Task Force on "Consumer Health Information for Better Nutrition" was formed. The Task Force recognized that there could be significant public health benefits when consumers have access to, and use, more and better information in conventional food as well as dietary supplement labeling to aid them in their purchases, information that goes beyond just price, convenience, and taste, but extends to include science-based health factors. Armed with more scientifically based information about the likely health benefits of the foods and dietary supplements they purchase, consumers can make a tangible difference in their own long-term health by lowering their risk of numerous chronic diseases.
To maximize the public health benefit of FDA's claims review process, the Task Force's Final Report(5) provides a procedure to prioritize on a case-by-case basis all complete petitions according to several factors, including whether the food or dietary supplement that is the subject of the petition is likely to have a significant impact on a serious or life-threatening illness; the strength of the evidence; whether consumer research has been provided to show the claim is not misleading; whether the substance of the claim has undergone an FDA safety review (i.e., is an authorized food additive, has been Generally Recognized as Safe (GRAS) affirmed, listed, or has received a letter of "no objection" to a GRAS notification); whether the substance that is the subject of the claim has been adequately characterized so that the relevance of available studies can be evaluated; whether the disease is defined and evaluated in accordance with generally accepted criteria established by a recognized body of qualified experts; and whether there is prior review of the evidence or the claim by a recognized body of qualified experts.
As part of the Task Force's final report(6), FDA developed an interim evidence-based review system that the agency intended to use to evaluate the substance/disease relationships that are subjects of qualified health claims. In reviewing the December 22, 1999 SSA guidance document and the 2003 Task Force report, it became apparent to the agency that the components of the scientific review process for an SSA health claim and qualified health claim are very similar. Because of the similarity between the scientific reviews for SSA and qualified health claims, FDA intends to use the approach set out in this guidance for evaluating the scientific evidence in petitions that are submitted for an SSA health claim or qualified health claim. The evidence-based review system set out in this guidance will assist the agency in determining whether the scientific evidence meets the SSA standard or, if not, whether the evidence supports a qualified health claim. In addition to a science review, health claims undergo a regulatory review. Health claims that meet the SSA standard are authorized by publication of a final rule or an interim final rule in the Federal Register. For qualified health claims supported by credible evidence, FDA issues a letter regarding its intent to consider enforcement discretion.
An evidence-based review system is a systematic science-based evaluation of the strength of the evidence to support a statement. In the case of health claims, it evaluates the strength of the scientific evidence to support a proposed claim about a substance/disease relationship. The evaluation process involves a series of steps to assess scientific studies and other data, eliminate those from which no conclusions about the substance/disease relationship can be drawn, rate the remaining studies for methodological quality and evaluate the strength of the totality of scientific evidence by considering study types, methodological quality, quantity of evidence for and against the claim (taking into account the numbers of various types of studies and study sample sizes), relevance to the U.S. population or target subgroup, replication of study results supporting the proposed claim, and overall consistency of the evidence. After assessing the totality of the scientific evidence, FDA determines whether there is SSA to support an authorized health claim, or credible evidence to support a qualified health claim.
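The sequence of steps described above can be pictured as a small screening-and-tallying pipeline. The Python sketch below is illustrative only: the `Study` record, its field names, the design labels, and the example entries are hypothetical and are not part of this guidance or of any FDA system. The quality-rating and totality-of-evidence steps depend on scientific judgment and are not modeled here.

```python
from dataclasses import dataclass

# Hypothetical labels for human study designs (illustrative only).
HUMAN_DESIGNS = {"intervention", "cohort", "case-control", "cross-sectional"}

@dataclass
class Study:
    name: str
    design: str                 # e.g., "intervention", "cohort", "review"
    conclusions_drawable: bool  # reviewer's judgment on critical elements
    supports_claim: bool        # direction of the reported finding

def screen(studies):
    """Keep human studies from which scientific conclusions can be drawn."""
    return [s for s in studies
            if s.design in HUMAN_DESIGNS and s.conclusions_drawable]

def tally(studies):
    """Count retained studies by design and by direction of finding."""
    counts = {}
    for s in studies:
        key = (s.design, "supports" if s.supports_claim else "does not support")
        counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical inputs.
studies = [
    Study("A", "intervention", True, True),
    Study("B", "intervention", False, True),  # e.g., no control group
    Study("C", "cohort", True, False),
    Study("D", "review", True, True),         # background only, not tallied
]
retained = screen(studies)
summary = tally(retained)
print([s.name for s in retained])
print(summary)
```

In this sketch, study B is eliminated because no conclusions can be drawn from it, and study D is set aside because it is not an individual human study.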
The agency considers the publicly available data and written information pertaining to the relationship between a substance and disease. Before the strength of the evidence for a substance/disease relationship can be assessed, FDA separates individual relevant articles on human studies from other types of data and information. FDA intends to focus its review primarily on articles reporting human intervention and observational studies because only such studies can provide evidence from which scientific conclusions can be drawn about the substance/disease relationship in humans. Next, the agency considers a number of threshold questions in the review of the scientific evidence:
Studies should identify a substance that is measurable. A "substance" is defined as a specific food or component of food regardless of whether the food is in conventional food form or a dietary supplement. 21 CFR 101.14(a)(2). A food component can be, for example, a nutrient or dietary ingredient.(7) If the substance is to be consumed as a component of conventional food at decreased dietary levels, the substance must be a nutrient that is required to be included in the Nutrition Facts label (21 CFR 101.14(b)(2)). If the substance is to be consumed at other than decreased dietary levels, the substance must contribute taste, aroma, nutritive value,(8) or a technical effect listed in 21 CFR 170.3(o) to the food, and must be safe and lawful for use at the levels necessary to justify a claim (21 CFR 101.14(b)(3)).
"Disease or health-related condition" is defined as damage to an organ, part, structure, or system of the body such that it does not function properly (e.g., cardiovascular disease), or a state of health leading to such dysfunctioning (e.g., hypertension). 21 CFR 101.14(a)(5). Studies should identify a measurable disease or health-related condition by measuring incidence, associated mortality, or validated surrogate endpoints that predict risk of a specific disease.
After considering these threshold issues, FDA categorizes the studies by type.
In an intervention study, subjects are provided the substance (food or food component) of interest (intervention group), typically either in the form of a conventional food or dietary supplement. The quality and quantity of the substance should be controlled for. In randomized controlled trials, subjects are assigned to an intervention group by chance rather than systematically. Individual subjects may not be similar to each other, but the intervention and control groups should be similar after randomization. Randomized controlled trials offer the best assessment of a causal relationship between a substance and a disease because they control for known confounders of results (i.e., other factors that could affect risk of disease). Through random assignment of subjects to the intervention and control groups, these studies avoid selection bias -- that is, the possibility that those subjects most likely to have a favorable outcome, independent of an intervention, are preferentially selected to receive the intervention. Potential bias is also reduced by "blinding" the study so that the subjects do not know whether they are receiving the intervention, or "double blinding," in which neither the subjects nor the researcher who assesses the outcome knows who is in the intervention group and who is in the control group. By controlling the test environment, including the amount and composition of substance consumed and all other dietary factors, these studies also can minimize the effects of variables or confounders on the results.(9) Therefore, randomized, controlled intervention studies provide the strongest evidence of whether or not there is a relationship between a substance and a disease (Greer et al., 2000).
Furthermore, such studies can provide convincing evidence of a cause and effect relationship between an intervention and an outcome (Kraemer et al., 2005 at 113). Randomization, however, may result in unequal distribution of the characteristic of the subjects between the control and treatment groups (e.g., baseline age or blood [serum or plasma] LDL cholesterol levels are significantly different). If the baseline values are significantly different, then it is difficult to determine if differences at the end of the study were due to the intervention or to differences at the beginning of the study. When the substance is provided as a supplement, a placebo should be provided to the control group. When the substance is a food, it may not be possible to provide a placebo and therefore subjects in such a study may not be blinded. Although the study may not be blinded in this case, a control group is still needed to draw conclusions from the study.
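As an illustration of assignment by chance and of the baseline check described above, the following Python sketch randomly splits subjects into two groups and compares their mean baseline values. The subject IDs, the LDL cholesterol values, and the function name `randomize` are hypothetical; this is simple randomization only, not a description of any particular trial's procedure.

```python
import random
from statistics import mean

def randomize(subject_ids, seed=None):
    """Assign subjects to two equal-sized groups by chance."""
    rng = random.Random(seed)      # seed is optional, for reproducibility
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical baseline LDL cholesterol values (mg/dL), keyed by subject ID.
baseline_ldl = {1: 128, 2: 141, 3: 119, 4: 155, 5: 133, 6: 147, 7: 122, 8: 138}

intervention, control = randomize(baseline_ldl, seed=42)
mean_intervention = mean(baseline_ldl[s] for s in intervention)
mean_control = mean(baseline_ldl[s] for s in control)
print(f"baseline LDL: intervention={mean_intervention:.1f}, "
      f"control={mean_control:.1f}")
```

A large gap between the two baseline means would suggest that randomization left the groups unbalanced. Deciding whether the gap is significant requires a formal statistical test, which this sketch does not perform.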
Randomized controlled trials typically have either a parallel or cross-over design. Parallel design studies involve two groups of subjects, the test group and the control group, which simultaneously receive the substance or serve as the control, respectively. Cross-over design involves all subjects crossing over from the intervention group to the control group and vice versa, after a defined time period.
Although intervention studies are the most reliable category of studies for determining a cause-and-effect relationship, generalizing from the studies conducted on selected populations to different populations may not be scientifically valid. For example, if the evidence consists of studies showing an association between intake of a substance and reduced risk of juvenile diabetes, then such studies should not be extrapolated to the risk of diabetes in adults.
Observational studies measure associations between the substance and disease. Observational studies lack the controlled setting of intervention studies. Observational studies are most reflective of free-living(10) populations and may be able to establish an association between the substance and the disease. In contrast to intervention studies, observational studies cannot determine whether an observed relationship represents a relationship in which the substance caused a reduction in disease risk or is a coincidence (Sempos et al., 1999). Because the subjects are not randomized based on various disease risk factors at the beginning of the study, information on known confounders of disease risk needs to be collected and adjusted for to minimize bias. For example, information on each subject's risk factors, such as age, race, body weight, and smoking, should be collected and used to adjust the data so that the substance/disease relationship is accurately measured. Risk factors that need to be adjusted for are determined for each disease being studied. For example, the risk of cardiovascular disease increases with age; therefore, an adjustment for age is needed in order to eliminate potential confounding.
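To illustrate why adjustment for a confounder such as age matters, the Python sketch below compares a crude (unadjusted) risk ratio with a Mantel-Haenszel risk ratio pooled across age strata. Mantel-Haenszel stratification is one common adjustment technique chosen here for illustration; this guidance does not prescribe a particular method. All counts and names are hypothetical.

```python
def crude_risk_ratio(strata):
    """Risk ratio computed after ignoring the stratifying variable."""
    exp_cases = sum(s["exposed_cases"] for s in strata)
    exp_total = sum(s["exposed_total"] for s in strata)
    unexp_cases = sum(s["unexposed_cases"] for s in strata)
    unexp_total = sum(s["unexposed_total"] for s in strata)
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)

def mantel_haenszel_risk_ratio(strata):
    """Risk ratio pooled across strata (Mantel-Haenszel estimator)."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s["exposed_total"] + s["unexposed_total"]
        numerator += s["exposed_cases"] * s["unexposed_total"] / n
        denominator += s["unexposed_cases"] * s["exposed_total"] / n
    return numerator / denominator

# Hypothetical cohort: younger subjects consume the substance more often,
# and disease risk is the same for consumers and non-consumers within each
# age group (1% in the younger stratum, 10% in the older stratum).
strata = [
    {"exposed_cases": 8, "exposed_total": 800,     # younger stratum
     "unexposed_cases": 2, "unexposed_total": 200},
    {"exposed_cases": 20, "exposed_total": 200,    # older stratum
     "unexposed_cases": 80, "unexposed_total": 800},
]
crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f}, age-adjusted RR = {adjusted:.2f}")
```

With these made-up counts, the crude ratio makes the substance look protective only because consumers are younger; once age is accounted for, the apparent effect disappears.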
In determining whether a substance that is the subject of the claim has been measured appropriately, it is important to critically evaluate the method of assessment of dietary intake. Many observational studies rely on self-reports of diet (e.g., diet records, 24-hour recalls, diet histories, and food frequency questionnaires), which are estimates of food intake (National Research Council, 1989). Diet records are based on the premise that food weights provide an accurate estimation of food intake. Subjects weigh the foods they consume and record those values. The 24-hour recall method requires that subjects describe which foods and how much of each food they consumed during the prior 24-hour period. Diet histories use questionnaires or interviewers to estimate the typical diet of subjects over a certain period of time. A food frequency questionnaire is the most common dietary assessment tool used in large observational studies of diet and health. Validated food frequency questionnaires are more reliable in estimating "usual" intake of foods compared to diet records or 24-hour recall methods (Subar et al., 2001). The questionnaire asks participants to report the frequency of consumption and portion size from a list of foods over a defined period of time. One problem with the dietary intake assessment methods described above is that there may be bias in the self-reporting of certain foods. For example, individuals who are overweight tend to under-report their portion sizes (Flegal et al., 1999) and therefore the actual amount of substances consumed is often underestimated. If there are reliable biomarkers of intake(11) of a substance, these biomarkers are often measured rather than using self-reported intakes.
Observational studies may be prospective or retrospective. In prospective studies, investigators recruit subjects and observe them prior to the occurrence of the disease outcome. Prospective observational studies compare the incidence of a disease with exposure to the substance. In retrospective studies, investigators review the medical records of subjects and/or interview subjects after the disease has occurred. Retrospective studies are particularly vulnerable to measurement error and recall bias because they rely on subjects' recollections of what they consumed in the past. Because of the limited ability of observational studies to control for variables, they are often susceptible to confounders, such as complex substance/disease interactions.
Well-designed observational studies can provide useful information for identifying possible associations to be tested by intervention studies, but they cannot provide convincing evidence of cause and effect (Kraemer et al., 2005 at 107). In contrast to intervention studies, even the best-designed observational studies cannot establish cause and effect between an intervention and an outcome (Kraemer et al., 2005 at 114). However, as discussed above, intervention studies can test whether there is evidence to show a cause and effect between a substance and a reduced risk of a disease. Observational studies from which scientific conclusions can be drawn can, in some situations, provide support for a substance/disease relationship for an SSA or qualified health claim. Each observational study design has its strengths and weaknesses, as discussed below (Sempos et al., 1999).
Cohort studies are prospective studies that compare the incidence of a disease in subjects who receive a specific exposure of the substance that is the subject of the claim with the outcome of subjects who do not receive that exposure. Because the intake of the substance precedes disease development, this study design ensures that the subjects are not consuming the substance in response to having the disease. Cohort studies can yield relative estimates of risk (Szklo and Nieto, 2000).(12) Cohort studies are considered to be the most reliable observational study design (Greer et al., 2000).
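As a worked illustration of a relative risk estimate from a cohort study, the Python sketch below divides the incidence of disease among exposed subjects by the incidence among unexposed subjects. The function name and the counts are hypothetical and are not drawn from any study discussed in this guidance.

```python
def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 of 1,000 exposed subjects and 60 of 1,000
# unexposed subjects develop the disease during follow-up.
rr = relative_risk(30, 1000, 60, 1000)
print(f"relative risk = {rr:.2f}")
```

A value below 1 indicates a lower incidence of disease in the exposed group, and a value of 1 indicates no difference between the groups.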
In case-control studies, subjects with a disease (cases) are compared to subjects who do not have the disease (controls).(13) Prior intake of the substance is estimated from dietary assessment methods for both cases and controls. These retrospective studies often ask about food consumption at least 1 year prior to diagnosis of the disease, making it difficult to obtain an accurate estimate of intake. Furthermore, a key assumption is that food consumption has not been altered by the disease process or by knowledge of having the disease. Thus, the case-control study design does not control for changes in intake caused by or in response to the disease. Case-control studies can yield an odds ratio, which is an estimate of the relative risk of getting the disease (Szklo and Nieto, 2000).(14) Case-control studies are considered to be less reliable than cohort studies (Greer et al., 2000).
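As a worked illustration of an odds ratio from a case-control study, the Python sketch below uses the standard cross-product of a two-by-two table. The function name and the counts are hypothetical.

```python
def odds_ratio(cases_exposed, cases_unexposed,
               controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Hypothetical study: 40 of 100 cases and 60 of 100 controls report
# high prior intake of the substance.
or_estimate = odds_ratio(40, 60, 60, 40)
print(f"odds ratio = {or_estimate:.2f}")
```

An odds ratio below 1 indicates that prior exposure was less common among cases than among controls.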
A nested case-control or case-cohort study uses subjects from a pre-defined cohort, such as the population of an ongoing cohort study. Cases are subjects diagnosed with the disease (e.g., lung cancer) in the cohort. In a nested case-control study, controls are subjects selected from individuals at risk each time a case (e.g., lung cancer) is diagnosed. In a case-cohort study, controls are selected randomly from the baseline cohort (Szklo and Nieto, 2000 at 34). Either a relative risk or odds ratio may be calculated in these types of studies. Nested case-control or case-cohort studies are considered less reliable than cohort studies but more reliable than case-control studies.
Cross-sectional studies usually involve collecting information on food consumption at a single point in time in individuals with and without a specific disease.(15) These studies can be useful for identifying possible correlates and can be useful for providing baseline information for subsequent prospective studies (Kraemer et al., 2005 at 99-100). However, because dietary intake and disease status are measured at the same time, it is not possible to determine whether dietary intake of the substance is a factor affecting the risk of the disease or a result of having the disease. Therefore, cross-sectional studies cannot show a cause and effect relationship. Cross-sectional studies calculate the prevalence of a disease based on exposure and this may be a measure of survival of the disease rather than the risk of developing the disease (Szklo and Nieto, 2000). Further, cross-sectional studies are considered to be a "relatively weak method of studying diet-disease associations" because they can be subject to significant potential measurement error regarding dietary intake due to inaccuracy of survey methods used and limited ability to control for dietary intake variations (Sempos et al., 1999). For these reasons, cross-sectional study results "have the potential to mislead as errors of interpretation are very common" (Kraemer et al., 2005 at 103). Cross-sectional studies are considered to be less reliable than cohort and case-control studies (Greer et al., 2000).
Ecological studies compare disease incidence across different populations. Case reports describe observations of a single subject or a small number of subjects. Ecological studies and case reports are the least reliable types of observational studies.
Reports that discuss a number of different studies, such as review articles,(16) do not provide sufficient information on the individual studies reviewed for FDA to determine critical elements such as the study population characteristics and the composition of the products used. Similarly, the lack of detailed information on studies summarized in review articles prevents FDA from determining whether the studies are flawed in critical elements such as design, conduct of studies, and data analysis. FDA must be able to review the critical elements of a study to determine whether any scientific conclusions can be drawn from it. Therefore, FDA intends to use review articles and similar publications(17) to identify reports of additional studies that may be useful to the health claim review and as background about the substance/disease relationship. If additional studies are identified, the agency intends to evaluate them individually. Most meta-analyses,(18) because they lack detailed information on the studies summarized, will only be used to identify reports of additional studies that may be useful to the health claim review and as background about the substance/disease relationship. FDA, however, intends to consider as part of its health claim review process a meta-analysis that reviews all the publicly available studies on the substance/disease relationship. The reviewed studies should be consistent with the critical elements, quality, and other factors set out in this guidance, and the statistical analyses should be adequately conducted.
FDA intends to use animal and in vitro studies as background information regarding mechanisms that might be involved in any relationship between the substance and disease. The physiology of animals is different from that of humans. In vitro studies are conducted in an artificial environment and cannot account for a multitude of normal physiological processes such as digestion, absorption, distribution, and metabolism that affect how humans respond to the consumption of foods and dietary substances (IOM, 2005). Animal and in vitro studies can be used to generate hypotheses or to explore a mechanism of action of a specific food component through controlled animal diets, but do not provide information from which scientific conclusions can be drawn regarding a relationship between the substance and disease in humans.
Surrogate endpoints are risk biomarkers(19) that have been shown to be valid predictors of disease risk and therefore may be used in place of clinical measurements of the incidence of the disease (e.g., diagnosis of disease) (Spilker et al. 1991). Because a number of diseases develop over a long period of time, it may not be possible to carry out the study for a long enough period to see a statistically meaningful difference in the incidence of disease among study subjects in the treatment and control groups.
These are examples of surrogate endpoints of disease risk accepted by the National Institutes of Health and/or FDA's Center for Drug Evaluation and Research: (1) serum low-density lipoprotein (LDL) cholesterol concentration, total serum cholesterol concentration, and blood pressure for cardiovascular disease; (2) bone mineral density for osteoporosis; (3) adenomatous colon polyps for colon cancer; and (4) elevated blood sugar concentrations and insulin resistance for type 2 diabetes.
There can be multiple pathways to a specific disease, such as cardiovascular disease. Therefore, the accepted surrogate endpoints that are involved in a single pathway may not be applicable to certain substances that are involved in a different pathway. For example, the long chain omega-3 fatty acids generally have no effect on serum LDL cholesterol levels and studies suggest that these fatty acids alter cardiovascular risk through a different pathway. Therefore, LDL cholesterol levels cannot be used in evaluating the relationship between the long chain omega-3 fatty acids and risk of cardiovascular disease.
Under the evidence-based review approach set out in this guidance, FDA intends to evaluate each individual human study to determine whether any scientific conclusions about the substance/disease relationship can be drawn from the study. Certain critical elements of a study, such as design, data collection, and data analysis, may be so seriously flawed that they make it impossible to draw scientific conclusions from the study. FDA does not intend to use studies from which it cannot draw any scientific conclusions about the substance/disease relationship, and plans to eliminate such studies from further review. Below are examples of questions that the agency intends to consider to determine whether scientific conclusions can be drawn from an intervention or observational study about the substance/disease relationship.
Health claims involve reducing the risk of a disease in people who do not have the disease that is the subject of the claim. FDA considers evidence from studies with subjects who have the disease that is the subject of the claim only if it is scientifically appropriate to extrapolate to individuals who do not have the disease. That is, the available scientific evidence demonstrates that (1) the mechanism(s) for the mitigation or treatment effects measured in the diseased populations are the same as the mechanism(s) for risk reduction effects in non-diseased populations and (2) the substance affects these mechanisms in the same way in both diseased and healthy people. If such evidence is not available, the agency cannot draw any scientific conclusions from studies that used subjects who have the disease that is the subject of the health claim to evaluate the substance/disease relationship and, therefore, the agency does not intend to use these studies to evaluate the substance/disease relationship. On the other hand, if, for example, FDA were reviewing a health claim on reduction of risk of coronary heart disease, it would consider studies that include individuals who have an unrelated disease (e.g., osteoporosis) or are at risk (e.g., elevated LDL cholesterol levels) of getting the disease that is the subject of the claim.
An appropriate control group represents study subjects who did not receive the substance. If an appropriate control group is not included, then it is not possible to ascertain whether changes in the endpoint of interest were due to the substance or due to unrelated and uncontrolled extraneous factors (Spilker et al., 1991; Federal Judicial Center, 2000). Without an appropriate control group, scientific conclusions cannot be drawn about a substance/disease relationship and, therefore, the agency does not intend to use these studies to evaluate the substance/disease relationship.
When the intervention study involves providing a whole food rather than a food component, the experimental and control diets should be similar enough that a relationship between the substance and disease can be evaluated. For example, if the substance is a specific type of fatty acid, then the composition of the experimental and control diets should be similar for all food components, except that particular fatty acid. Scientific conclusions cannot be drawn about the relationship between a substance and a disease when the amounts of other substances that are known to affect the risk of the disease that is the subject of the claim are different between the control and experimental diets.
When the substance is a food component, it may not be possible to accurately determine its independent effects when whole foods or multi-nutrient supplements are provided to the intervention group. For example, if the claim were about a relationship between lutein and age-related macular degeneration (AMD), then scientific conclusions could not be drawn from a study that provided spinach or multi-nutrient supplements that contain other substances (e.g., vitamin C, vitamin E, and zinc) that have been suggested to have a role in protecting against AMD.
If the baseline values for the endpoint being measured are significantly different, then it is difficult to interpret the findings of the intervention. For example, in a study of the effects of a low-sodium diet on the risk of cardiovascular disease, having baseline blood pressure levels higher in the intervention group than in the control group would lead to uncertainty as to whether any observed effect resulted from the difference in the sodium intake between the two groups. Providing a "lead-in"(20) diet or a "wash-out" period(21) for studies with a cross-over design for an adequate duration prior to randomization can help reduce the likelihood of different baseline values.
Statistical analysis of the study data is a critical factor because it provides the comparison between subjects consuming the substance and those not consuming the substance, to determine whether there is a reduction in risk of the disease. Furthermore, when conducting statistical analyses among more than two groups, the data should be analyzed by a test designed for multiple comparisons (e.g., Bonferroni, Duncan). Thus when statistical analyses are not performed between the control and intervention group or are conducted inappropriately, scientific conclusions cannot be drawn about the role of the substance in reducing the risk of the disease and, therefore, the agency does not intend to use such studies to evaluate the substance/disease relationship.
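As a minimal sketch of the multiple-comparison adjustment mentioned above, the following Python snippet applies a Bonferroni correction to a set of pairwise p-values. The p-values, the comparison labels, and the function name bonferroni_adjust are hypothetical and chosen only for illustration; they are not taken from any study or from an FDA procedure.

```python
def bonferroni_adjust(p_values, alpha=0.05):
    """Bonferroni correction: multiply each raw p-value by the number of
    comparisons (capped at 1.0) and compare the result with alpha."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical raw p-values from three pairwise comparisons among
# a control group and two intervention groups (illustrative only).
raw = {
    "control vs low dose": 0.030,
    "control vs high dose": 0.004,
    "low dose vs high dose": 0.200,
}

adjusted, reject = bonferroni_adjust(list(raw.values()))
for (label, p), p_adj, significant in zip(raw.items(), adjusted, reject):
    print(f"{label}: raw p={p:.3f}, adjusted p={p_adj:.3f}, significant={significant}")
```

In this illustration, a comparison that would appear significant at the 0.05 level when tested alone (raw p = 0.030) is no longer significant once the three comparisons are accounted for.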
As discussed above, when the study does not measure disease incidence or associated mortality, then surrogate endpoints are essential for measuring risk. Scientific conclusions cannot be drawn about the relationship between the substance and risk of the disease if the risk biomarker is not a surrogate endpoint (see discussion above in Section III.C). The agency does not intend to use such studies from which scientific conclusions cannot be drawn in its evaluation of the substance/disease relationship.
Studies that use a surrogate endpoint should be conducted long enough to ensure that any change in the endpoint is in response to the dietary intervention. If the study is run for a short time period such that the effects of the substance cannot be evaluated, then scientific conclusions cannot be drawn about the relationship between the substance and the disease and, therefore, the agency does not intend to use such a study to evaluate the substance/disease relationship. For example, FDA has considered 3 weeks to be the minimum duration for evaluating the effect of an intervention with various saturated fats on serum LDL cholesterol concentration (Kris-Etherton and Dietschy, 1997).
When the dietary intervention involves dietary advice rather than a prescribed diet administered under a controlled condition, there should be some type of assessment of the changes in intake of the substance (e.g., dietary assessment or measurement of a biological sample in response to dietary advice). Without some type of assessment of whether the dietary advice resulted in a change in intake of the substance, scientific conclusions cannot be drawn about the substance/disease relationship and, therefore, the agency does not intend to use studies that lack such an assessment to evaluate the substance/disease relationship.
It is important that the study population is relevant to the general U.S. population or the population subgroup identified in the proposed claim. Thus, FDA evaluates each study to determine if the study population lives in an area where malnutrition or inadequate intake of the specific substance is common, and/or where the prevalence or etiology of the disease that is the subject of the claim is not similar to that in the United States. For certain countries, there may be risk factors of a specific disease that are not relevant to disease risk in the United States (e.g., risk factors for gastric cancer in certain Asian countries). Differences in nutrition, diet, and disease risk factors between the United States and the country where a study was done may mean that the study results cannot be extrapolated to the U.S. population or population subgroup. For example, scientific conclusions about the comparatively well-nourished U.S. population cannot be drawn from studies in subjects who are malnourished. Nutrient status and metabolism can be severely altered when an individual is malnourished and therefore the effect of the substance on a particular surrogate endpoint may be very different between a malnourished and well-nourished individual (Shils et al., 2006).
Scientific conclusions cannot be drawn from studies conducted in countries or regions where inadequate intake of the substance is common since a response to the intake of the substance may be due to the correction of a nutrient deficiency for which health claims are not intended. Furthermore, conclusions cannot be drawn from studies conducted in countries or regions where the etiology of the disease is very different than in the United States. For example, major risk factors for gastric cancer in Japan (high salt intake and Helicobacter pylori (H. pylori) infection) are significantly more prevalent than in the United States. Therefore, it is not appropriate to extrapolate from data on a Japanese population concerning the relationship between a substance and gastric cancer to reach conclusions about potential effects on the U.S. population.
Biological samples (e.g., blood, urine, tissue, or hair) should be used to establish intake of a substance only if a dose-response relationship has been demonstrated between intake of the substance and the level of the substance (or a metabolite of the substance) in the biological sample. For example, there should be evidence to demonstrate a strong correlation(22) between the intake level of the substance and the level of the substance or a metabolite in the biological sample (e.g., selenium intake and nail selenium concentration). If the correlation is weak for a specific biological sample, then scientific conclusions cannot be drawn from studies that used that biological sample as a biomarker of intake. Biological samples in case-control studies should not be used to establish intake of the substance since the metabolism or concentration of the substance may be altered in subjects with the disease (cases).
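The strength of a correlation between intake and a biological sample can be illustrated with a short calculation. The sketch below computes a Pearson correlation coefficient in Python. The intake and nail selenium values are invented for illustration, the function name pearson_r is hypothetical, and the choice of the Pearson statistic is an assumption; the guidance does not specify which correlation statistic or threshold applies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: daily selenium intake (micrograms/day) and
# nail selenium concentration (micrograms/g) for five subjects.
intake = [40, 55, 70, 85, 100]
nail_selenium = [0.60, 0.72, 0.79, 0.95, 1.02]

r = pearson_r(intake, nail_selenium)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 for such data would indicate that the biological sample tracks intake closely; a value near zero would indicate that it does not.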
A single 24-hour recall is generally regarded as an inadequate method for assessing an individual's usual intake of a substance, although it may be useful for assessing mean intake of a group. A diet history involves extensive interviews with the study subjects. A food frequency questionnaire contains a limited number of food items and is inadequate for assessing intake of a specific food if the major sources of the food are not included in the questionnaire. Food frequency questionnaires also do not always account for different varieties of a particular food or different cooking methods. Because of these limitations, validation of the food frequency questionnaire method to assess food intake is essential in order to be able to draw conclusions from the scientific data, as the failure to validate may lead to false associations between dietary factors and diseases or disease-related markers.(23)
Because observational studies estimate intake of a whole food based on recorded dietary intake methods such as food frequency questionnaires, diet recalls, or diet records, a common weakness of observational studies is the limited ability to ascertain the actual intake of the substance for the population studied. Furthermore, if the substance is a food component rather than a whole food, there is an additional estimation of the amount of the food component that is present in the individual foods. The content of food components can vary based on factors such as soil composition, food processing/cooking procedures, or storage (duration, temperature). Thus, it is difficult to ascertain an accurate amount of the food component consumed based on reports of dietary intake of whole foods.
In addition, the whole food and products that include several food components, e.g., multi-nutrient dietary supplements, contain not only the food component that is the subject of the claim, but also other food components that may be associated with the metabolism of the food component of interest or the pathogenesis of disease or health-related condition. Because whole foods and products such as multi-nutrient dietary supplements consist of many food components, it is difficult to study the food components in isolation (Sempos et al., 1999). For studies based on recorded dietary intake of whole foods or multiple food components, it is not possible to accurately determine whether any observed effects of the food component that is the subject of the claim on disease risk were due to: (1) that food component alone; (2) interactions with other food components; (3) other food components acting alone or together; or (4) decreased consumption of other substances contained in foods displaced from the diet by the increased intake of foods rich in the food component of interest (See Sempos et al. (1999), Willett (1990) and Willett (1998) regarding the complexity of identifying the relationship between a specific food component within a food and a disease).
In fact, evidence demonstrates that in a number of instances, observational studies based on the recorded dietary intake of conventional foods may indicate a benefit for a particular nutrient with respect to a disease but it is subsequently demonstrated in an intervention study that the nutrient-containing dietary supplement does not confer a benefit or actually increases risk of the disease (Lichtenstein and Russell, 2005). For example, previous observational studies reported an association between fruits and vegetables high in beta-carotene and a reduced risk of lung cancer (Peto et al., 1981). However, subsequent intervention studies, the Alpha-Tocopherol and Beta Carotene Cancer Prevention Study (ATBC) and the Beta-Carotene and Retinol Efficacy Trial (CARET), demonstrated that beta-carotene supplements increase the risk of lung cancer in smokers and asbestos-exposed workers, respectively (The Alpha-Tocopherol and Beta Carotene Cancer Prevention Study Group, 1994; Omenn et al., 1996). These studies illustrate that a nutrient provided as a dietary supplement can have different health effects than when it is consumed among many other food components. Furthermore, these studies demonstrate the potential public health risk of relying on results from epidemiological studies, in which the effect of a nutrient is based on recorded dietary intake of conventional foods, as the sole source for concluding that a relationship exists between a specific nutrient and disease risk; the effect could actually be harmful. For the above reasons, scientific conclusions from observational studies cannot be drawn about a relationship between a food component and a disease. Observational studies, however, can be used to measure associations between a whole food and a disease.(24)
For the studies that are not eliminated during the earlier evaluation, FDA intends to independently rate each such study for methodological quality. Studies can receive a high, moderate, or low quality rating. FDA intends to base this quality rating on several factors related to study design, data collection, the quality of the statistical analysis, the type of outcome measured, and study population characteristics other than relevance to the U.S. population (e.g., selection bias and the provision of important subject information [e.g., age, smokers]). If the scientific study adequately addressed all or most of the above factors, FDA plans to give it a high methodological quality rating. FDA plans to give moderate or low quality ratings based on the extent of the deficiencies or uncertainties in the quality factors. Studies that are so deficient in quality that they receive a low quality rating are studies from which scientific conclusions cannot be drawn about the substance/disease relationship and are eliminated from further review.
Examples of factors FDA intends to consider in assessing the methodological quality of individual studies remaining at this point in the scientific evaluation approach set out in this guidance include the following:
Appropriate randomization eliminates intrinsic and/or extrinsic factors, other than the substance, that could have an influence on the outcome of the study. Blinding is especially important when the endpoint can be influenced by a subject's awareness that he or she is receiving something that may be beneficial. Blinding would be critical when the outcome measure is cognitive performance, mental status (e.g., memory, depression), or behavior. Including a placebo in a supplementation trial prevents the subject from knowing whether he or she is receiving the substance or not.
For instance, were healthy or high-risk subjects allowed to take medications that can affect the disease that is the subject of the claim during the study? If so, was the proportion of subjects taking medications similar between the control and intervention groups?
If there were a marked number of drop-outs, then it would be important to know why subjects dropped out and how the drop-outs affected the number and composition of the intervention and placebo group.
Several aspects of a substance/disease relationship may give rise to confounders. Therefore, it is important to adjust for confounders of the disease of interest so that observed effects on risk of disease that may be due to confounders are not incorrectly attributed to the substance of interest. For example, there can be multiple non-dietary risk factors for a disease (e.g., smoking, body mass index, and age for hypertension). Therefore, when evaluating the relationship between sodium and blood pressure, an adjustment of the risk analysis should be made based on age, smoking, and body mass index.
Validated food frequency questionnaires are more reliable in estimating "usual" intake of foods compared to diet records or 24-hour recall methods. See Section III.B.
Under the approach set out in this guidance, at this point, FDA intends to evaluate the results of the studies from which scientific conclusions can be drawn and rate the strength of the total body of publicly available evidence. The agency plans to conduct this evaluation by considering the study type (e.g., intervention, prospective cohort, case-control, cross-sectional), methodological quality rating previously assigned, quantity of evidence (number of the various types of studies and sample sizes), relevance of the body of scientific evidence to the U.S. population or target subgroup, whether study results supporting the proposed claim have been replicated(25), and the overall consistency(26),(27) of the total body of evidence. Based on the totality of the scientific evidence, FDA determines whether such evidence meets the SSA standard or whether such evidence is credible to support a qualified health claim for the substance/disease relationship.
Within each study type, the studies are reviewed for:
FDA evaluates whether the totality of the evidence supports a claim for the entire U.S. population or just a subgroup. If the evidence only supports a claim for a subgroup, that information would be set out in the claim. If the substance is one that must be used for risk reduction at much higher levels than the normal U.S. intake, that information would also be reflected in the claim.
In general, intervention studies provide the strongest evidence for an effect, regardless of existing observational studies on the same relationship. Intervention studies are designed to avoid selection bias and avoid findings that are due to chance or other confounders of disease (Sempos et al., 1999). Although the evaluation of substance/disease relationships often involves both intervention and observational studies, observational studies generally cannot be used to rule out the findings from more reliable intervention studies (Sempos et al., 1999). Thus, when randomized, controlled intervention studies are consistent in showing or not showing a substance/disease relationship, they trump the findings of any number of observational studies (Barton, 2005). This is because such studies are designed and controlled to test whether there is evidence of a cause-and-effect relationship between the substance and the reduced risk of a disease, whereas observational studies are only able to identify possible associations. There are numerous examples (such as vitamin E and CVD, folate and CVD, and beta-carotene and lung cancer) where associations identified in observational studies have been publicized. However, when randomized, controlled intervention studies were later conducted to test these possible associations, the intervention studies found no evidence to support the relationships (Lichtenstein and Russell, 2005).
Health claims represent a continuum of scientific evidence that extends from very limited or inconclusive evidence to consensus, with evidence supporting SSA health claims lying closer to consensus. FDA's determination of SSA represents the agency's best judgment as to whether qualified experts would likely agree that the scientific evidence supports the substance/disease relationship that is the subject of a proposed health claim. The SSA standard is intended to be a strong standard that provides a high level of confidence in the validity of the substance/disease relationship. SSA means that the validity of the relationship is not likely to be reversed by new and evolving science. The assessment of SSA then derives from the conclusion that there is a sufficient body of relevant scientific evidence that shows consistency across different studies and among different researchers. When such a high level of confidence is lacking (that is, when the evidence for the relationship is credible but does not meet the SSA standard), the proposed claim for the substance/disease relationship should include qualifying language that expresses the level of scientific evidence supporting the relationship.
The health claim language should reflect the level of scientific evidence with specificity and accuracy. However, gaps in the scientific evidence may sometimes limit the information that can be included in the claims. For example, when the scientific evidence is limited but credible, it may not be possible for the qualified health claim to identify an amount of the substance that is associated with a reduced risk of the disease.
When there is credible evidence available to suggest a relationship between the substance and disease, it is important to determine whether the substance has an independent role in the relationship or whether its role is based on the inclusion or replacement (i.e., substitution) of other substances. An example of where the evaluation of the independent role of a substance can be challenging is when the substance is a conventional food or macronutrient (e.g., fat or carbohydrate). When evaluating the relationship between a conventional food or macronutrient and a disease, the inclusion of either in the diet usually requires the removal of other conventional foods or macronutrients (i.e., substitution to yield isocaloric diets). If it is determined that the substance does not play an independent role and/or requires the reduction or inclusion of another substance to show a beneficial effect, the claim language will reflect this finding.
FDA may reevaluate a health claim in response to a petitioner or on its own initiative, and when it does so it intends to use the scientific evaluation process described above. To maximize the public health benefit of its health claims review, FDA intends to evaluate new information that becomes available to determine whether it necessitates a change to an existing SSA or qualified health claim (QHC). For example, scientific evidence may become available that will (1) support the revision of claim language for an SSA claim or QHC, (2) support change of an SSA claim to a QHC or support change of a QHC to an SSA claim, or (3) raise safety concerns about the substance that is the subject of a health claim or otherwise no longer support a health claim (SSA or QHC).
(1) This guidance has been prepared by the Office of Nutrition, Labeling and Dietary Supplements in FDA's Center for Food Safety and Applied Nutrition.
(2) For brevity, "disease" will be used as shorthand for "disease or health-related condition." "Disease or health-related condition" is defined as damage to an organ, part, structure, or system of the body such that it does not function properly (e.g., cardiovascular disease), or a state of health leading to such dysfunctioning (e.g., hypertension). 21 CFR 101.14(a)(5).
(3) This new guidance document, if finalized, will replace FDA's guidance "Interim Evidence-based Ranking System for Scientific Data" which addresses the science review of qualified health claims. Although the interim evidence-based ranking system guidance includes a section on ranking the strength of the scientific evidence, this draft guidance document does not include such a section because studies are being conducted on the consumer's understanding of various possible ranking systems that could be used to describe the strength of the evidence for a health claim. FDA intends to reexamine its ranking systems and issue appropriate guidance once these studies are completed. In addition, if this guidance is finalized, it will replace "Guidance for Industry: Significant Scientific Agreement in the Review of Health Claims for Conventional Foods and Dietary Supplements."
(4) In 1997, Congress enacted the Food and Drug Administration Modernization Act, which established an alternative authorization procedure for health claims based on authoritative statements of certain federal scientific bodies or the National Academy of Sciences. This guidance document does not address that alternative procedure.
(5) See guidance entitled "Interim Procedures for Qualified Health Claims in the Labeling of Conventional Human Food and Human Dietary Supplements," July 10, 2003.
(6) See guidance entitled "Interim Evidence-based Ranking System for Scientific Data," July 10, 2003.
(7) See 21 U.S.C. 321(ff)(1).
(8) "Nutritive value" is defined in 21 CFR 101.14(a)(3) as value in sustaining human existence by such processes as promoting growth, replacing loss of essential nutrients, or providing energy.
(9) Confounders are factors associated with both the disease in question and the intervention, and that if not controlled for prevent an investigator from being able to conclude that an outcome was caused by an intervention.
(10) Free-living populations represent those who consume diets and have lifestyles (e.g., smoking, drinking, and exercise) of their own choice.
(11) Biomarkers of intake are measurements of the substance itself or a metabolite of the substance in biological samples (e.g., serum selenium) that have been validated to confirm that they reflect the intake of that substance.
(12) Relative risk is expressed as the ratio of the risk (disease incidence) in exposed individuals to that in unexposed individuals. It is calculated in prospective cohorts by measuring exposure of the substance in subjects with and without disease. An adjusted relative risk controls for potential confounders.
(13) An example of a case-control study is a study design that assesses parameters related to the frequency and distribution of disease in a population, such as leading cause of death.
(14) An odds ratio is the odds of developing the disease in exposed compared to unexposed individuals. It is calculated in case-control studies by measuring disease development in subjects based on exposure to the substance. An adjusted odds ratio controls for potential confounders.
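The two ratios defined in footnotes (12) and (14) can be illustrated with a short unadjusted calculation from 2x2 counts. The Python sketch below uses invented counts, and the function names relative_risk and odds_ratio are hypothetical; it does not adjust for confounders, which an adjusted relative risk or adjusted odds ratio would require.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of disease incidence in exposed to that in unexposed subjects."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Cross-product ratio of a 2x2 table of exposure by disease status."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical cohort: 30 of 1000 high-intake subjects and 60 of 1000
# low-intake subjects develop the disease during follow-up.
rr = relative_risk(30, 1000, 60, 1000)

# Hypothetical case-control study: 40 of 100 cases and 25 of 100 controls
# were exposed to the substance.
or_value = odds_ratio(40, 60, 25, 75)

print(f"Relative risk: {rr:.2f}")
print(f"Odds ratio: {or_value:.2f}")
```

With these invented counts, a relative risk below 1 corresponds to lower disease incidence in the exposed group, and an odds ratio above 1 corresponds to higher odds of exposure among cases than among controls.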
(15) In a few cross-sectional studies, it is a time-series study that compares outcomes during different time periods (e.g., whether the rate of occurrence of a particular outcome during one five-year period changed during a subsequent five-year period).
(16) Review articles summarize the findings of individual studies.
(17) Other examples include book chapters, abstracts, letters, and committee reports.
(18) A meta-analysis is the process of systematically combining and evaluating the results of clinical trials that have been completed or terminated (Spilker, 1991).
(19) Risk biomarkers are biological indicators that signal a changed physiological state that is associated with the risk of a disease.
(20) A diet that is provided to all study groups prior to randomization.
(21) Time period within a cross-over design study during which subjects do not receive an intervention.
(22) Correlation is evaluated using correlation coefficients (r). Correlation coefficients range from -1 (perfect negative correlation) through +1 (perfect positive correlation). The closer r is to -1 or +1, the stronger the correlation; the closer to zero, the weaker the correlation.
(23) "Validation of the food frequency questionnaire method is essential, as incorrect information may lead to false associations between dietary factors and disease or disease-related markers." Cade, J., Thompson, R., Burley, V., and Warm D. Development, Validation and Utilization of Food-Frequency Questionnaires-A Review. Public Health Nutrition, 5: page 573, 2002. See, also, Subar, A., et al., Comparative validation of the Block, Willett, and National Cancer Institute Food Frequency Questionnaires, American Journal of Epidemiology, 154: 1089-1099, 2001.
(24) In Pearson v. Shalala, the D.C. Circuit noted that FDA had "logically determined" that the consumption of antioxidant vitamins in dietary supplement form could not be scientifically proven to reduce the risk of cancer where the existing research had examined only foods containing antioxidant vitamins, as the effect of those foods on reducing the risk of cancer may have resulted from other substances in those foods. 164 F.3d 650, 658 (D.C. Cir. 1999). The D.C. Circuit, however, concluded that FDA's concern with granting antioxidant vitamins a qualified health claim could be accommodated by simply adding a prominent disclaimer noting that the evidence for such a claim was inconclusive, given that the studies supporting the claim were based on foods containing other substances that might actually be responsible for reducing the risk of cancer. Id. The court noted that FDA did not assert that the dietary supplements at issue would "threaten consumer's health and safety." Id. at 656. There is, however, a more fundamental problem with allowing qualified health claims for individual nutrients based on studies of foods containing those nutrients than the problem the D.C. Circuit held could be cured with a disclaimer. Even if the effect of the specific component of the food could be determined with certainty, recent scientific findings on the complex nature of nutrient-food interactions and on the relationship between diet, biological parameters, and disease indicate that nutrients found to have health benefits when consumed in one food or group of foods may not necessarily have the same beneficial effect when they are consumed in dietary supplement form or in other foods. See Lichtenstein and Russell (2005).
For example, not only have studies on dietary supplements established that the benefits associated with the dietary intake of certain nutrients do not materialize when the nutrients are taken as a supplement, but some of these studies have actually indicated an increased risk for the very disease the nutrients were predicted to prevent. Id. Thus, a study based on intake of a specific food or foods provides no information from which scientific conclusions may be drawn for the nutrient itself. Further, even if the nutrients are consumed in other foods rather than in a dietary supplement, the physiological effects may be different because the food matrix can affect the bioavailability and bioactivity of the nutrients. Id.
(25) Replication of scientific findings is important for evaluating the strength of scientific evidence (Wilson, E.B. An Introduction to Scientific Research. Dover Publications, 1990; pages 46-48).
(26) In this guidance, "consistency" is used to mean the level of agreement among the studies from which scientific conclusions could be drawn about the substance/disease relationship.
(27) Consistency of findings among similar and different study designs is important for evaluating causation and the strength of scientific evidence (Hill A.B. The environment and disease: association or causation? Proceedings of the Royal Society of Medicine. 1965;58:295-300); see also Systems to rate the scientific evidence from the Agency for Healthcare Research and Quality, which defines "consistency" as "the extent to which similar findings are reported using similar and different study designs." [http://www.ahrq.gov/clinic/epcsums/strengthsum.htm#Contents]
(28) Tertile, quartile and quintile of intake is the result of dividing a study population into 3, 4 or 5 groups, respectively, such that the average intake level of the substance varies across the groups (e.g., lowest intake group represents the lowest tertile of intake and the highest intake group represents the highest tertile). The study population is divided such that each group has the same number of subjects.
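The grouping described in footnote (28) can be sketched as follows. This Python example sorts subjects by intake and divides them into equal-sized groups; the intake values and the function name split_into_intake_groups are hypothetical, and the handling of group sizes when the population does not divide evenly is an assumption of this sketch rather than something the guidance specifies.

```python
def split_into_intake_groups(intakes, n_groups):
    """Sort intake values and split them into n_groups groups of equal size
    (sizes differ by at most one when the count does not divide evenly)."""
    if n_groups < 1 or n_groups > len(intakes):
        raise ValueError("n_groups must be between 1 and the number of subjects")
    ordered = sorted(intakes)
    n = len(ordered)
    groups = []
    for g in range(n_groups):
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        groups.append(ordered[start:end])
    return groups


# Hypothetical daily intake values (arbitrary units) for nine subjects.
intakes = [12, 45, 7, 33, 28, 51, 19, 40, 24]

tertiles = split_into_intake_groups(intakes, 3)
for rank, group in enumerate(tertiles, start=1):
    mean_intake = sum(group) / len(group)
    print(f"Tertile {rank}: {group} (mean intake {mean_intake:.1f})")
```

Passing 4 or 5 as the second argument would produce quartiles or quintiles in the same way.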