United States National Library of Medicine National Institutes of Health

HTA 101: Glossary

Absolute risk reduction: a measure of treatment effect that compares the probability (or mean) of a type of outcome in the control group with that of a treatment group, [i.e.: Pc - Pt (or µc - µt)]. For instance, if the results of a trial were that the probability of death in a control group was 25% and the probability of death in a treatment group was 10%, the absolute risk reduction would be (0.25 - 0.10) = 0.15. (See also number needed to treat, odds ratio, and relative risk reduction.)
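The arithmetic in this entry can be sketched in a few lines of Python. The function name `absolute_risk_reduction` is illustrative only and is not part of the glossary; the inputs are the 25% and 10% death probabilities from the example above.

```python
def absolute_risk_reduction(p_control: float, p_treatment: float) -> float:
    """Return Pc - Pt for two outcome probabilities expressed as fractions."""
    for p in (p_control, p_treatment):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_control - p_treatment


# Probabilities of death taken from the glossary example.
arr = absolute_risk_reduction(p_control=0.25, p_treatment=0.10)
print(f"Absolute risk reduction = {arr:.2f}")  # prints 0.15
```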

Accuracy: the degree to which a measurement (e.g., the mean estimate of a treatment effect) is true or correct. An estimate can be accurate, yet not be precise, if it is based upon an unbiased method that provides observations having great variation (i.e., not close in magnitude to each other). (Contrast with precision.)

Alpha (α): the probability of a Type I (false-positive) error. In hypothesis testing, the α-level is the threshold for defining statistical significance. For instance, setting α at a level of 0.05 implies that investigators accept that there is a 5% chance of concluding incorrectly that an intervention is effective when it has no true effect. The α-level is commonly set at 0.01 or 0.05 or 0.10.

Benchmarking: a quality assurance process in which an organization sets goals and measures its performance in comparison to those of the products, services, and practices of other organizations that are recognized as leaders.

Beta (β): the probability of a Type II (false-negative) error. In hypothesis testing, β is the probability of concluding incorrectly that an intervention is not effective when it has a true effect. (1-β) is the power to detect an effect of an intervention if one truly exists.

Bias: in general, any factor that distorts the true nature of an event or observation. In clinical investigations, a bias is any systematic factor other than the intervention of interest that affects the magnitude of (i.e., tends to increase or decrease) an observed difference in the outcomes of a treatment group and a control group. Bias diminishes the accuracy (though not necessarily the precision) of an observation. Randomization is a technique used to decrease this form of bias. Bias also refers to a prejudiced or partial viewpoint that would affect someone's interpretation of a problem. Double blinding is a technique used to decrease this type of bias.

Bibliographic database: an indexed computer or printed source of citations of journal articles and other reports in the literature. Bibliographic citations typically include author, title, source, abstract, and/or related information (including full text in some cases). Examples are MEDLINE and EMBASE.

Blinding: also known as "masking," the withholding of knowledge from patients and/or investigators about whether individual patients are receiving the investigational intervention(s) or the control (or standard) intervention(s) in a clinical trial. Blinding is intended to eliminate the possibility that knowledge of which intervention is being received will affect patient outcomes or investigator behaviors that may affect outcomes. Blinding is not always practical (e.g., when comparing surgery to drug treatment), but it should be used whenever it is possible and compatible with optimal patient care. A single-blinded trial is one in which this knowledge is withheld only from patients; a double-blinded trial is one in which the knowledge is also withheld from investigators; and a triple-blinded trial is one in which the knowledge is also withheld from the statisticians or other analysts of trial data.

Case-control study: a retrospective observational study designed to determine the relationship between a particular outcome of interest (e.g., disease or condition) and a potential cause (e.g., an intervention, risk factor, or exposure). Investigators identify a group of patients with a specified outcome (cases) and a group of patients without the specified outcome (controls). Investigators then compare the histories of the cases and the controls to determine the rate or level at which each group experienced a potential cause. As such, this study design leads from outcome (disease or condition) to cause (intervention, risk factor, or exposure).

Case series: see series.

Case study: an uncontrolled (prospective or retrospective) observational study involving an intervention and outcome in a single patient. (Also known as a single case report or anecdote.)

Causal pathway: also known as an "analytical framework," a depiction (e.g., in a schematic) of direct and indirect linkages between interventions and outcomes. For a clinical problem, a causal pathway typically includes a patient population, one or more alternative interventions (e.g., screening, diagnosis, and/or treatment), intermediate outcomes (e.g., biological markers), and health outcomes. Causal pathways are intended to provide clarity and explicitness in defining the questions to be addressed in an assessment; they are useful in identifying pivotal linkages for which evidence may be lacking.

Citation: the record of an article, book, or other report in a bibliographic database that includes summary descriptive information, e.g., authors, title, abstract, source, and indexing terms.

Clinical pathway: a multidisciplinary set of daily prescriptions and outcome targets for managing the overall care of a specific type of patient, e.g., from pre-admission to post-discharge for patients receiving inpatient care. Clinical pathways often are intended to maintain or improve quality of care and decrease costs for patients in particular diagnosis-related groups.

Clinical practice guidelines: a systematically developed statement to assist practitioner and patient decisions about appropriate health care for one or more specific clinical circumstances. The development of clinical practice guidelines can be considered to be a particular type of HTA; or, it can be considered to be one of the types of policymaking that is informed or supported by HTA.

Clinical significance: a conclusion that an intervention has an effect that is of practical meaning to patients and health care providers. Even though an intervention is found to have a statistically significant effect, this effect might not be clinically significant. In a trial with a large number of patients, a small difference between treatment and control groups may be statistically significant but clinically unimportant. In a trial with few patients, an important clinical difference may be observed that does not achieve statistical significance. (A larger trial may be needed to confirm that this is a statistically significant difference.)

Cohort study: an observational study in which outcomes in a group of patients that received an intervention are compared with outcomes in a similar group, i.e., the cohort, either contemporary or historical, of patients that did not receive the intervention. In an adjusted- (or matched-) cohort study, investigators identify (or make statistical adjustments to provide) a cohort group that has characteristics (e.g., age, gender, disease severity) that are as similar as possible to the group that experienced the intervention.

Compliance: a measure of the extent to which patients undergo an assigned treatment or regimen, e.g., taking drugs, undergoing a medical or surgical procedure, doing an exercise regimen, or abstaining from smoking.

Concealment of allocation: the process used to assign patients to alternative groups in an RCT in a manner that prevents foreknowledge (by the person managing the allocation as well as the patients) of this assignment. Medical record numbers, personal identification numbers, or birthdays are not adequate for concealment of allocation. Certain centralized randomization schemes and sequentially numbered sealed, opaque envelopes are among adequate methods of allocation concealment.

Concurrent nonrandomized control: a control group that is observed by investigators at the same time as the treatment group, but that was not established using random assignment of patients to control and treatment groups. Differences in the composition of the treatment and control groups may result.

Confidence interval: depicts the range of uncertainty about an estimate of a treatment effect. It is calculated from the observed differences in outcomes of the treatment and control groups and the sample size of a study. The confidence interval (CI) is the range of values above and below the point estimate that is likely to include the true value of the treatment effect. The use of CIs assumes that a study provides one sample of observations out of many possible samples that would be derived if the study were repeated many times. Investigators typically use CIs of 90%, 95%, or 99%. For instance, a 95% CI indicates that there is a 95% probability that the CI calculated from a particular study includes the true value of a treatment effect. If the interval includes a null treatment effect (usually 0.0, but 1.0 if the treatment effect is calculated as an odds ratio or relative risk), the null hypothesis of no true treatment effect cannot be rejected.
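As a rough illustration of how such an interval can be calculated, the sketch below computes a 95% CI for a difference in event probabilities using the normal (Wald) approximation. The approximation, the function name `risk_difference_ci`, and the sample sizes (200 patients per group) are assumptions chosen for illustration; they do not come from the glossary, and other interval methods exist.

```python
import math


def risk_difference_ci(events_c, n_c, events_t, n_t, z=1.96):
    """Approximate CI for Pc - Pt using the normal (Wald) approximation.

    z = 1.96 corresponds to a 95% interval.
    """
    p_c = events_c / n_c
    p_t = events_t / n_t
    diff = p_c - p_t
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    return diff - z * se, diff + z * se


# Hypothetical trial: 50 of 200 control patients die (25%),
# 20 of 200 treated patients die (10%).
low, high = risk_difference_ci(events_c=50, n_c=200, events_t=20, n_t=200)
print(f"95% CI for the risk difference: {low:.3f} to {high:.3f}")
print("Includes the null value 0.0:", low <= 0.0 <= high)
```

In this hypothetical case the interval runs from about 0.077 to about 0.223 and excludes 0.0, so the null hypothesis of no treatment effect would be rejected at the corresponding α-level.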

Consensus development: various forms of group judgment in which a group (or panel) of experts interacts in assessing an intervention and formulating findings by vote or other process of reaching general agreement. These processes may be informal or formal, involving such techniques as the nominal group and Delphi techniques.

Contraindication: a clinical symptom or circumstance indicating that the use of an otherwise advisable intervention would be inappropriate.

Control group: a group of patients that serves as the basis of comparison when assessing the effects of the intervention of interest that is given to the patients in the treatment group. Depending upon the circumstances of the trial, a control group may receive no treatment, a "usual" or "standard" treatment, or a placebo. To make the comparison valid, the composition of the control group should resemble that of the treatment group as closely as possible. (See also historical control and concurrent nonrandomized control.)

Controlled clinical trial: a prospective experiment in which investigators compare outcomes of a group of patients receiving an intervention to a group of similar patients not receiving the intervention. Not all clinical trials are RCTs, though all RCTs are clinical trials.

Controlled vocabulary: a system of terms, involving, e.g., definitions, hierarchical structure, and cross-references, that is used to index and retrieve a body of literature in a bibliographic, factual, or other database. An example is the MeSH controlled vocabulary used in MEDLINE and other MEDLARS databases of the NLM.

Cost-benefit analysis: a comparison of alternative interventions in which costs and outcomes are quantified in common monetary units.

Cost-consequence analysis: a form of cost-effectiveness analysis in which the components of incremental costs (of therapies, hospitalization, etc.) and consequences (health outcomes, adverse effects, etc.) of alternative interventions or programs are computed and displayed, without aggregating these results (e.g., into a cost-effectiveness ratio).

Cost-effectiveness analysis: a comparison of alternative interventions in which costs are measured in monetary units and outcomes are measured in non-monetary units, e.g., reduced mortality or morbidity.

Cost-minimization analysis: a determination of the least costly among alternative interventions that are assumed to produce equivalent outcomes.

Cost-utility analysis: a form of cost-effectiveness analysis of alternative interventions in which costs are measured in monetary units and outcomes are measured in terms of their utility, usually to the patient, e.g., using QALYs.
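One common way to summarize a cost-utility comparison is an incremental cost-effectiveness ratio: the difference in cost between two interventions divided by the difference in QALYs. The sketch below assumes that summary measure; the function name and all cost and QALY figures are hypothetical and do not come from the glossary.

```python
def incremental_cost_per_qaly(cost_new, qalys_new, cost_old, qalys_old):
    """Return (difference in cost) / (difference in QALYs)."""
    delta_qalys = qalys_new - qalys_old
    if delta_qalys == 0:
        raise ValueError("no difference in QALYs; the ratio is undefined")
    return (cost_new - cost_old) / delta_qalys


# Hypothetical figures: the new intervention costs 12,000 and yields 6.0 QALYs;
# the existing one costs 8,000 and yields 5.5 QALYs.
icer = incremental_cost_per_qaly(12_000, 6.0, 8_000, 5.5)
print(f"Incremental cost per QALY gained = {icer:,.0f}")  # prints 8,000
```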

Cost-of-illness analysis: a determination of the economic impact of a disease or health condition, including treatment costs; this form of study does not address benefits/outcomes.

Crossover bias: occurs when some patients who are assigned to the treatment group in a clinical study do not receive the intervention or receive another intervention, or when some patients in the control group receive the intervention (e.g., outside the trial). If these crossover patients are analyzed with their original groups, this type of bias can "dilute" (diminish) the observed treatment effect.

Crossover design: a clinical trial design in which patients receive, in sequence, the treatment (or the control), and then, after a specified time, switch to the control (or treatment). In this design, patients serve as their own controls, and randomization may be used to determine the order in which a patient receives the treatment and control.

Cross-sectional study: a (prospective or retrospective) observational study in which a group is chosen (sometimes as a random sample) from a certain larger population, and the exposures of people in the group to an intervention and outcomes of interest are determined.

Database (or register): any of a wide variety of repositories (often computerized) for observations and related information about a group of patients (e.g., adult males living in Göteborg) or a disease (e.g., hypertension) or an intervention (e.g., antihypertensive drug therapy) or other events or characteristics. Depending upon criteria for inclusion in the database, the observations may have controls. Although these can be useful, a variety of confounding factors (e.g., no randomization and possible selection bias in the process by which patients or events are recorded) make them relatively weak methods for determining causal relationships between an intervention and an outcome.

Decision analysis: an approach to decision making under conditions of uncertainty that involves modeling of the sequences or pathways of multiple possible strategies (e.g., of diagnosis and treatment for a particular clinical problem) to determine which is optimal. It is based upon available estimates (drawn from the literature or from experts) of the probabilities that certain events and outcomes will occur and the values of the outcomes that would result from each strategy. A decision tree is a graphical representation of the alternate pathways.
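A very small decision tree can be evaluated by weighting the value of each outcome by its probability and comparing the totals across strategies. The sketch below assumes that expected-value rule; the strategy names, probabilities, and outcome values (on a 0 to 1 utility scale) are hypothetical and chosen only for illustration.

```python
# Each strategy is a list of (probability, outcome value) branches.
# All numbers are hypothetical.
strategies = {
    "treat": [(0.8, 0.9), (0.2, 0.4)],
    "watchful waiting": [(0.6, 1.0), (0.4, 0.3)],
}


def expected_value(branches):
    """Probability-weighted average of the outcome values on one pathway."""
    total_probability = sum(p for p, _ in branches)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(p * value for p, value in branches)


expected = {name: expected_value(b) for name, b in strategies.items()}
best = max(expected, key=expected.get)

for name, value in expected.items():
    print(f"{name}: expected value = {value:.2f}")
print("Preferred strategy:", best)  # prints "treat" for these inputs
```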

Delphi technique: an iterative group judgment technique in which a central source forwards surveys or questionnaires to isolated, anonymous (to each other) participants whose responses are collated/summarized and recirculated to the participants in multiple rounds for further modification/critique, producing a final group response (sometimes statistical).

Direct costs: the fixed and variable costs of all resources (goods, services, etc.) consumed in the provision of an intervention as well as any consequences of the intervention such as adverse effects or goods or services induced by the intervention. Includes direct medical costs and direct nonmedical costs such as transportation or child care.

Disability-adjusted life years (DALYs): a unit of health care status that adjusts age-specific life expectancy by the loss of health and years of life due to disability from disease or injury. DALYs are often used to measure the global burden of disease.

Discounting: the process used in cost analyses to reduce mathematically future costs and/or benefits/outcomes to their present value. These adjustments reflect that given levels of costs and benefits occurring in the future usually have less value in the present than the same levels of costs and benefits realized in the present.

Discount rate: the interest rate used to discount or calculate future costs and benefits so as to arrive at their present values, e.g., 3% or 5%. This is also known as the opportunity cost of capital investment. Discount rates are usually based on government bonds or market interest rates for cost of capital whose maturity is about the same as the time period during which the intervention or program is being evaluated. For example, the discount rate used by the US federal government is based on the Treasury Department cost of borrowing funds and will vary, depending on the period of analysis.
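Discounting is usually carried out with the standard present-value formula, PV = FV ÷ (1 + r)^t, where r is the annual discount rate and t is the number of years. The sketch below applies that formula at the 3% and 5% rates mentioned above; the function name, the future cost of 1,000, and the 10-year horizon are hypothetical.

```python
def present_value(future_amount: float, annual_rate: float, years: float) -> float:
    """Discount a single future cost or benefit back to its present value."""
    if annual_rate <= -1.0:
        raise ValueError("annual_rate must be greater than -1")
    return future_amount / (1.0 + annual_rate) ** years


# Hypothetical cost of 1,000 incurred 10 years from now.
pv_3 = present_value(1000.0, 0.03, 10)
pv_5 = present_value(1000.0, 0.05, 10)
print(f"Present value at 3%: {pv_3:.2f}")  # prints 744.09
print(f"Present value at 5%: {pv_5:.2f}")  # prints 613.91
```

A higher discount rate gives future costs and benefits less weight in the present, which is why the 5% figure is lower than the 3% figure.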

Disease management: a systematic process of managing care of patients with specific diseases or conditions (particularly chronic conditions) across the spectrum of outpatient, inpatient, and ancillary services. The purposes of disease management may include: reduce acute episodes, reduce hospitalizations, reduce variations in care, improve health outcomes, and reduce costs. Disease management may involve continuous quality improvement or other management paradigms. It may involve a cyclical process of following practice protocols, measuring the resulting outcomes, feeding those results back to clinicians, and revising protocols as appropriate.

Dissemination: any process by which information is transmitted (made available or accessible) to intended audiences or target groups.

Effect size: same as treatment effect. Also, a dimensionless measure of treatment effect that is typically used for continuous variables and is usually defined as the difference in mean outcomes of the treatment and control group divided by the standard deviation of the outcomes of the control group. One type of meta-analysis involves averaging the effect sizes from multiple studies.
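The dimensionless form described here (difference in means divided by the control group's standard deviation, sometimes called Glass's delta) can be sketched as follows. The use of the sample standard deviation, the function name, and the outcome values are assumptions made for illustration.

```python
from statistics import mean, stdev


def effect_size(treatment_outcomes, control_outcomes):
    """(mean of treatment - mean of control) / standard deviation of control.

    Uses the sample (n - 1) standard deviation, which is an assumption here.
    """
    return (mean(treatment_outcomes) - mean(control_outcomes)) / stdev(control_outcomes)


# Hypothetical continuous outcome measurements.
control = [10, 12, 14, 16, 18]
treatment = [13, 15, 17, 19, 21]

es = effect_size(treatment, control)
print(f"Effect size = {es:.2f}")  # prints 0.95
```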

Effectiveness: the benefit (e.g., to health outcomes) of using a technology for a particular problem under general or routine conditions, for example, by a physician in a community hospital or by a patient at home.

Effectiveness research: see outcomes research.

Efficacy: the benefit of using a technology for a particular problem under ideal conditions, for example, in a laboratory setting, within the protocol of a carefully managed randomized controlled trial, or at a "center of excellence."

Endpoint: a measure or indicator chosen for determining an effect of an intervention.

Equipoise: a state of uncertainty regarding whether alternative health care interventions will confer more favorable outcomes, including balance of benefits and harms. Under the principle of equipoise, a patient should be enrolled in a randomized controlled trial only if there is substantial uncertainty (an expectation for equal likelihood) about which intervention will benefit the patient most.

Evidence-based medicine: the use of current best evidence from scientific and medical research to make decisions about the care of individual patients. It involves formulating questions relevant to the care of particular patients, searching the scientific and medical literature, identifying and evaluating relevant research results, and applying the findings to patients.

Evidence table: a summary display of selected characteristics (e.g., of methodological design, patients, outcomes) of studies of a particular intervention or health problem.

External validity: the extent to which the findings obtained from an investigation conducted under particular circumstances can be generalized to other circumstances. To the extent that the circumstances of a particular investigation (e.g., patient characteristics or the manner of delivering a treatment) differ from the circumstances of interest, the external validity of the findings of that investigation may be questioned.

Factual database: an indexed computer or printed source that provides reference or authoritative information, e.g., in the form of guidelines for diagnosis and treatment, patient indications, or adverse effects.

False negative error: occurs when the statistical analysis of a trial detects no difference in outcomes between a treatment group and a control group when in fact a true difference exists. This is also known as a Type II error. The probability of making a Type II error is known as β (beta).

False positive error: occurs when the statistical analysis of a trial detects a difference in outcomes between a treatment group and a control group when in fact there is no difference. This is also known as a Type I error. The probability of a Type I error is known as α (alpha).

Follow-up: the ability of investigators to observe and collect data on all patients who were enrolled in a trial for its full duration. To the extent that data on patient events relevant to the trial are lost, e.g., among patients who move away or otherwise withdraw from the trial, the results may be affected, especially if there are systematic reasons why certain types of patients withdraw. Investigators should report on the number and type of patients who could not be evaluated, so that the possibility of bias may be considered.

Gray literature: research reports that are not found in traditional peer-reviewed publications, for example: government agency monographs, symposium proceedings, and unpublished company reports.

Health-related quality of life (HRQL) measures: patient outcome measures that extend beyond traditional measures of mortality and morbidity, to include such dimensions as physiology, function, social activity, cognition, emotion, sleep and rest, energy and vitality, health perception, and general life satisfaction. (Some of these are also known as health status, functional status, or quality of life measures.)

Health technology assessment (HTA): the systematic evaluation of properties, effects, and/or impacts of health care technology. It may address the direct, intended consequences of technologies as well as their indirect, unintended consequences. Its main purpose is to inform technology-related policymaking in health care. HTA is conducted by interdisciplinary groups using explicit analytical frameworks drawing from a variety of methods.

Health services research: a field of inquiry that examines the impact of the organization, financing and management of health care services on the delivery, quality, cost, access to and outcomes of such services.

Healthy-years equivalents (HYEs): the number of years of perfect health that are considered equivalent to (i.e., have the same utility as) the remaining years of life in their respective health states.

Historical control: a control group that is chosen from a group of patients who were observed at some previous time. The use of historical controls raises concerns about valid comparisons because they are likely to differ from the current treatment group in their composition, diagnosis, disease severity, determination of outcomes, and/or other important ways that would confound the treatment effect. It may be feasible to use historical controls in special instances where the outcomes of a standard treatment (or no treatment) are well known and vary little for a given patient population.

Hypothesis testing: a means of interpreting the results of a clinical trial that involves determining the probability that an observed treatment effect could have occurred due to chance alone if a specified hypothesis were true. The specified hypothesis is normally a null hypothesis, made prior to the trial, that the intervention of interest has no true effect. Hypothesis testing is used to determine if the null hypothesis can or cannot be rejected.

Incidence: the rate of occurrence of new cases of a disease or condition in a population at risk during a given period of time, usually one year.

Indication: a clinical symptom or circumstance indicating that the use of a particular intervention would be appropriate.

Indirect costs: the cost of time lost from work and decreased productivity due to disease, disability, or death. (In cost accounting, it refers to the overhead or fixed costs of producing goods or services.)

Intangible costs: the cost of pain and suffering resulting from a disease, condition, or intervention.

Intention to treat analysis: a type of analysis of clinical trial data in which all patients are included in the analysis based on their original assignment to intervention or control groups, regardless of whether patients failed to fully participate in the trial for any reason, including whether they actually received their allocated treatment, dropped out of the trial, or crossed over to another group.

Internal validity: the extent to which the findings of a study accurately represent the causal relationship between an intervention and an outcome in the particular circumstances of that study. The internal validity of a trial can be suspect when certain types of biases in the design or conduct of a trial could have affected outcomes, thereby obscuring the true direction, magnitude, or certainty of the treatment effect.

Investigational Device Exemption (IDE): a regulatory category and process in which the US Food and Drug Administration (FDA) allows specified use of an unapproved health device in controlled settings for purposes of collecting data on safety and efficacy/effectiveness; this information may be used subsequently in a premarketing approval application.

Investigational New Drug Application (IND): an application submitted by a sponsor to the US FDA prior to human testing of an unapproved drug or of a previously approved drug for an unapproved use.

Language bias: a form of bias that may affect the findings of a systematic review or other literature synthesis that arises when research reports are not identified or are excluded based on the language in which they are published.

Large, simple trials: prospective, randomized controlled trials that use large numbers of patients, broad patient inclusion criteria, multiple study sites, minimal data requirements, and electronic registries; their purposes include detecting small and moderate treatment effects, gaining effectiveness data, and improving external validity.

Literature review: a summary and interpretation of research findings reported in the literature. May include unstructured qualitative reviews by single authors as well as various systematic and quantitative procedures such as meta-analysis. (Also known as overview.)

Marginal benefit: the additional benefit (e.g., in units of health outcome) produced by an additional resource use (e.g., another health care intervention).

Marginal cost: the additional cost required to produce an additional unit of benefit (e.g., unit of health outcome).

Markov model: a type of quantitative modeling that involves a specified set of mutually exclusive and exhaustive states (e.g., of a given health status), and for which there are transition probabilities of moving from one state to another (including of remaining in the same state). Typically, states have a uniform time period, and transition probabilities remain constant over time.
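A minimal cohort version of such a model can be sketched as follows. The three states and every transition probability are hypothetical; they are chosen only to show mutually exclusive and exhaustive states with constant transition probabilities applied over uniform cycles.

```python
# Hypothetical states and transition probabilities (each row sums to 1).
STATES = ["well", "sick", "dead"]
TRANSITIONS = {
    "well": {"well": 0.85, "sick": 0.10, "dead": 0.05},
    "sick": {"well": 0.00, "sick": 0.70, "dead": 0.30},
    "dead": {"well": 0.00, "sick": 0.00, "dead": 1.00},
}


def step(distribution):
    """Advance the cohort by one cycle of uniform length."""
    following = {state: 0.0 for state in STATES}
    for source, share in distribution.items():
        for target, probability in TRANSITIONS[source].items():
            following[target] += share * probability
    return following


def run(distribution, cycles):
    """Apply the same transition probabilities for a number of cycles."""
    for _ in range(cycles):
        distribution = step(distribution)
    return distribution


# Everyone starts in the "well" state; follow the cohort for two cycles.
cohort = run({"well": 1.0, "sick": 0.0, "dead": 0.0}, cycles=2)
for state in STATES:
    print(f"{state}: {cohort[state]:.4f}")
```

With these illustrative inputs, about 72% of the cohort is well, 15.5% sick, and 12% dead after two cycles, and the shares always sum to 1 because the states are exhaustive.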

Meta-analysis: systematic methods that use statistical techniques for combining results from different studies to obtain a quantitative estimate of the overall effect of a particular intervention or variable on a defined outcome. This combination may produce a stronger conclusion than can be provided by any individual study. (Also known as data synthesis or quantitative overview.)

Monte Carlo simulation: a technique used in computer simulations that uses sampling from a random number sequence to simulate characteristics or events or outcomes with multiple possible values. For example, this can be used to represent or model many individual patients in a population with ranges of values for certain health characteristics or outcomes. In some cases, the random components are added to the values of a known input variable for the purpose of determining the effects of fluctuations of this variable on the values of the output variable.
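The idea of modeling many individual patients by random sampling can be sketched with Python's standard `random` module. The event probabilities (25% and 10%) reuse the figures from the risk examples elsewhere in this glossary; the function name, the cohort size of 10,000, and the fixed seed are assumptions made so that the illustration is repeatable.

```python
import random


def simulate_event_rate(p_event, n_patients, rng):
    """Simulate n_patients individuals, each with probability p_event of the outcome."""
    events = sum(1 for _ in range(n_patients) if rng.random() < p_event)
    return events / n_patients


rng = random.Random(42)  # fixed seed so the run can be repeated
control_rate = simulate_event_rate(0.25, 10_000, rng)
treatment_rate = simulate_event_rate(0.10, 10_000, rng)

print(f"Simulated control event rate:   {control_rate:.3f}")
print(f"Simulated treatment event rate: {treatment_rate:.3f}")
print(f"Simulated risk difference:      {control_rate - treatment_rate:.3f}")
```

The simulated rates fluctuate around the input probabilities; larger simulated cohorts give values closer to 0.25 and 0.10.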

Moving target problem: changes in health care that can render the findings of HTAs out of date, sometimes before their results can be implemented. Included are changes in the focal technology, changes in the alternative or complementary technologies (i.e., those that are used for managing a given health problem), emergence of new competing technologies, and changes in the application of the technology (e.g., to different patient populations or to different health problems).

N of 1 trial: a clinical trial in which a single patient is the total population for the trial, including a single case study. An N of 1 trial in which random allocation is used to determine the order in which an experimental and a control intervention are given to a patient is an N of 1 RCT.

Negative predictive value: see predictive value negative.

New Drug Application (NDA): an application submitted by a sponsor to the FDA for approval to market a new drug (a new, nonbiological molecular entity) for human use in US interstate commerce.

Nonrandomized controlled trial: a controlled clinical trial that assigns patients to intervention and control groups using a method that does not involve randomization, e.g., at the convenience of the investigators or some other technique such as alternate assignment.

Nominal group technique: a face-to-face group judgment technique in which participants generate silently, in writing, responses to a given question/problem; responses are collected and posted, but not identified by author, for all to see; responses are openly clarified, often in a round-robin format; further iterations may follow; and a final set of responses is established by voting/ranking.

Null hypothesis: in hypothesis testing, the hypothesis that an intervention has no effect, i.e., that there is no true difference in outcomes between a treatment group and a control group. Typically, if statistical tests indicate that the P value is at or above the specified α-level (e.g., 0.01 or 0.05), then any observed treatment effect is not statistically significant, and the null hypothesis cannot be rejected. If the P value is less than the specified α-level, then the treatment effect is statistically significant, and the null hypothesis is rejected. If a confidence interval (e.g., of 95% or 99%) includes zero treatment effect, then the null hypothesis cannot be rejected.

Number needed to treat: a measure of treatment effect that provides the number of patients who need to be treated to prevent one outcome event. It is the inverse of absolute risk reduction (1 ÷ absolute risk reduction); i.e., 1.0 ÷ (Pc - Pt). For instance, if the results of a trial were that the probability of death in a control group was 25% and the probability of death in a treatment group was 10%, the number needed to treat would be 1.0 ÷ (0.25 - 0.10) = 6.7 patients. (See also absolute risk reduction, relative risk reduction, and odds ratio.)
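The calculation in this entry can be sketched as follows. The function name is illustrative only, and rounding up to a whole number of patients is a common reporting convention that the glossary itself does not state.

```python
import math


def number_needed_to_treat(p_control: float, p_treatment: float) -> float:
    """Return 1 / (Pc - Pt), the inverse of the absolute risk reduction."""
    arr = p_control - p_treatment
    if arr <= 0:
        raise ValueError("no risk reduction; the number needed to treat is undefined")
    return 1.0 / arr


# Probabilities of death taken from the glossary example.
nnt = number_needed_to_treat(p_control=0.25, p_treatment=0.10)
print(f"Number needed to treat = {nnt:.1f}")       # prints 6.7
print(f"Rounded up to whole patients = {math.ceil(nnt)}")  # prints 7
```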

Observational study: a study in which the investigators do not manipulate the use of, or deliver, an intervention (e.g., do not assign patients to treatment and control groups), but only observe patients who are (and sometimes patients who are not, as a basis of comparison) exposed to the intervention, and interpret the outcomes. These studies are more subject to selection bias than experimental studies such as randomized controlled trials.

Odds ratio: a measure of treatment effect that compares the odds of a type of outcome in the treatment group with the odds of that outcome in a control group, i.e., [Pt ÷ (1 - Pt)] ÷ [Pc ÷ (1 - Pc)]. For instance, if the results of a trial were that the probability of death in a control group was 25% and the probability of death in a treatment group was 10%, the odds ratio of death would be [0.10 ÷ (1.0 - 0.10)] ÷ [0.25 ÷ (1.0 - 0.25)] = 0.33. (See also absolute risk reduction, number needed to treat, and relative risk.)
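The odds ratio arithmetic can be checked with a short sketch. With the probabilities of death from the example (25% in the control group, 10% in the treatment group), the odds ratio for death works out to about 0.33, and the odds ratio for survival is its reciprocal, about 3.0. The function names are illustrative only.

```python
def odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must be at least 0 and less than 1")
    return p / (1.0 - p)


def odds_ratio(p_treatment: float, p_control: float) -> float:
    """Odds of the outcome in the treatment group divided by odds in the control group."""
    control_odds = odds(p_control)
    if control_odds == 0:
        raise ValueError("control odds are zero; the ratio is undefined")
    return odds(p_treatment) / control_odds


# Probabilities of death from the glossary example.
or_death = odds_ratio(p_treatment=0.10, p_control=0.25)
# Probabilities of survival are the complements: 90% and 75%.
or_survival = odds_ratio(p_treatment=0.90, p_control=0.75)

print(f"Odds ratio for death    = {or_death:.2f}")     # prints 0.33
print(f"Odds ratio for survival = {or_survival:.2f}")  # prints 3.00
```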

Outcomes research: evaluates the impact of health care on the health outcomes of patients and populations. It may also include evaluation of economic impacts linked to health outcomes, such as cost effectiveness and cost utility. Outcomes research emphasizes health problem- (or disease-) oriented evaluations of care delivered in general, real-world settings; multidisciplinary teams; and a wide range of outcomes, including mortality, morbidity, functional status, mental well-being, and other aspects of health-related quality of life. It may entail any in a range of primary data collection methods and synthesis methods that combine data from primary studies.

P value: in hypothesis testing, the probability of observing a difference between the intervention and control groups at least as large as the one found if the null hypothesis is true, i.e., if chance alone were at work. If P is less than the α-level (typically 0.01 or 0.05) chosen prior to the study, then the null hypothesis is rejected.
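
To make the comparison of P with the α-level concrete, here is a hedged sketch that uses a two-sided Z test for the difference between two proportions. The choice of test, the group size of 100 patients each, and the function name are all assumptions made for illustration; the glossary does not prescribe them.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(events_c: int, n_c: int, events_t: int, n_t: int) -> float:
    """Two-sided P value from a pooled Z test comparing two proportions."""
    p_c = events_c / n_c
    p_t = events_t / n_t
    # Under the null hypothesis both groups share one underlying proportion.
    pooled = (events_c + events_t) / (n_c + n_t)
    se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_c - p_t) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # chosen before the study
# Hypothetical trial: 25 of 100 control patients die, 10 of 100 treated patients die.
p = two_proportion_p_value(25, 100, 10, 100)
print(f"P = {p:.4f}; reject null hypothesis: {p < alpha}")
```

With these hypothetical counts P is roughly 0.005, so the null hypothesis would be rejected at α = 0.05.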

Parallel group (or independent group) trial: a trial that compares two contemporaneous groups of patients, one of which receives the treatment of interest and one of which is a control group (e.g., a randomized controlled trial). (Some parallel trials have more than one treatment group; others compare two treatment groups, each acting as a control for the other.)

Patient selection bias: a bias that occurs when patients assigned to the treatment group differ from patients assigned to the control group in ways that can affect outcomes, e.g., age or disease severity. If the two groups are constituted differently, it is difficult to attribute observed differences in their outcomes to the intervention alone. Random assignment of patients to the treatment and control groups minimizes opportunities for this bias.

Peer review: the process by which manuscripts submitted to health, biomedical, and other scientifically oriented journals and other publications are evaluated by experts in appropriate fields (usually anonymous to the authors) to determine if the manuscripts are of adequate quality for publication.

Phase I, II, III, and IV studies: phases of clinical trials of new technologies (usually drugs) in the development and approval process required by the FDA (or other regulatory agencies). Phase I trials typically involve approximately 20-80 healthy volunteers to determine a drug's safety, safe dosage range, absorption, metabolic activity, excretion, and the duration of activity. Phase II trials are controlled trials in approximately 100-300 volunteer patients (with disease) to determine the drug's efficacy and adverse reactions (sometimes divided into Phase IIa pilot trials and Phase IIb well-controlled trials). Phase III trials are larger controlled trials in approximately 1,000-3,000 patients to verify efficacy and monitor adverse reactions during longer-term use (sometimes divided into Phase IIIa trials conducted before regulatory submission and Phase IIIb trials conducted after regulatory submission but before approval). Phase IV trials are postmarketing studies to monitor long-term effects and provide additional information on safety and efficacy, including for different regimens and patient groups.

Placebo: an inactive substance or treatment given to satisfy a patient's expectation for treatment. In some controlled trials (particularly of drug treatments) placebos that are made to be indistinguishable by patients (and providers when possible) from the true intervention are given to the control group to be used as a comparative basis for determining the effect of the investigational treatment.

Placebo effect: the effect on patient outcomes (improved or worsened) that may occur due to the expectation by a patient (or provider) that a particular intervention will have an effect. The placebo effect is independent of the true effect (pharmacological, surgical, etc.) of a particular intervention. To control for this, the control group in a trial may receive a placebo.

Positive predictive value: see predictive value positive.

Power: the probability of detecting a treatment effect of a given magnitude when a treatment effect of at least that magnitude truly exists. For a true treatment effect of a given magnitude, power is the probability of avoiding Type II error, and is generally defined as (1 - β).
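
One way to see how power depends on sample size is the normal approximation sketched below for a two-sided comparison of two proportions. The approximation (which ignores the negligible chance of a significant result in the wrong direction), the group sizes, and the function name `approximate_power` are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def approximate_power(p_control: float, p_treatment: float,
                      n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate (1 - beta) for detecting the difference Pc - Pt."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two independent proportions.
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_treatment * (1 - p_treatment) / n_per_group)
    return NormalDist().cdf(abs(p_control - p_treatment) / se - z_crit)

for n in (50, 100, 200):
    print(n, round(approximate_power(0.25, 0.10, n), 2))
```

Under these assumptions, power rises from about one half with 50 patients per group to well above 0.9 with 200 per group.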

Precision: the degree to which a measurement (e.g., the mean estimate of a treatment effect) is derived from a set of observations having small variation (i.e., close in magnitude to each other). A narrow confidence interval indicates a more precise estimate of effect than a wide confidence interval. A precise estimate is not necessarily an accurate one. (Contrast with accuracy.)

Predictive value negative: an operating characteristic of a diagnostic test; predictive value negative is the proportion of persons with a negative test who truly do not have the disease, determined as: [true negatives ÷ (true negatives + false negatives)]. It varies with the prevalence of the disease in the population of interest. (Contrast with predictive value positive.)

Predictive value positive: an operating characteristic of a diagnostic test; predictive value positive is the proportion of persons with a positive test who truly have the disease, determined as: [true positives ÷ (true positives + false positives)]. It varies with the prevalence of the disease in the population of interest. (Contrast with predictive value negative.)
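
The dependence of both predictive values on prevalence can be illustrated with a short sketch. It assumes a hypothetical test with 90% sensitivity and 90% specificity and expresses the four cells of the 2x2 table as population proportions; the function name is likewise an assumption.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (predictive value positive, predictive value negative)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Same hypothetical test, three different populations.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

In this sketch the predictive value positive climbs from under 0.1 at 1% prevalence to 0.9 at 50% prevalence, even though the test itself is unchanged.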

Premarketing Approval (PMA) Application: an application made by the sponsor of a health device to the FDA for approval to market the device in US interstate commerce. The application includes information documenting the safety and efficacy/effectiveness of the device.

Prevalence: the number of people in a population with a specific disease or condition at a given time, usually expressed as a ratio of the number of affected people to the total population.

Primary study: an investigation that collects original (primary) data from patients, e.g., randomized controlled trials, observational studies, series of cases, etc. (Contrast with synthetic/integrative study).

Probability distribution: portrays the relative likelihood that a range of values is the true value of a treatment effect. This distribution often appears in the form of a bell-shaped curve. An estimate of the most likely true value of the treatment effect is the value at the highest point of the distribution. The area under the curve between any two points along the range gives the probability that the true value of the treatment effect lies between those two points. Thus, a probability distribution can be used to determine an interval that has a designated probability (e.g., 95%) of including the true value of the treatment effect.
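
As a sketch of how such an interval is obtained, the snippet below assumes a bell-shaped (normal) distribution centered on a hypothetical estimated treatment effect of 0.15 with a hypothetical standard error of 0.04. It finds the interval that holds 95% of the area under the curve.

```python
from statistics import NormalDist

# Hypothetical estimate of the treatment effect and its standard error.
dist = NormalDist(mu=0.15, sigma=0.04)

# Cut off 2.5% of the area in each tail to leave 95% in the middle.
low = dist.inv_cdf(0.025)
high = dist.inv_cdf(0.975)

# The area under the curve between the two points is the probability
# that the true value lies between them.
area = dist.cdf(high) - dist.cdf(low)
print(f"95% interval: {low:.3f} to {high:.3f} (area = {area:.2f})")
```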

Prospective study: a study in which the investigators plan and manage the intervention of interest in selected groups of patients. As such, investigators do not know what the outcomes will be when they undertake the study. (Contrast with retrospective study.)

Publication bias: unrepresentative publication of research reports that is not due to the quality of the research but to other characteristics, e.g., tendencies of investigators to submit, and publishers to accept, positive research reports (i.e., ones with results showing a beneficial treatment effect of a new intervention).

Quality-adjusted life year (QALY): a unit of health care outcomes that adjusts gains (or losses) in years of life subsequent to a health care intervention by the quality of life during those years. QALYs can provide a common unit for comparing cost-utility across different interventions and health problems. Analogous units include disability-adjusted life years (DALYs) and healthy-years equivalents (HYEs).
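
A minimal sketch of the adjustment follows: each period of life is weighted by a quality-of-life (utility) value, assumed here to run from 0 (death) to 1 (perfect health). The periods, weights, and function name are hypothetical, and the sketch applies no discounting of future years.

```python
def qalys(periods):
    """Sum years * utility weight over (years, utility) pairs."""
    return sum(years * utility for years, utility in periods)

# Hypothetical patient: with treatment, 4 years at 0.75 then 2 years at 0.5;
# without treatment, 3 years at 0.5.
with_treatment = qalys([(4, 0.75), (2, 0.5)])
without_treatment = qalys([(3, 0.5)])

print(with_treatment)                      # 4.0
print(without_treatment)                   # 1.5
print(with_treatment - without_treatment)  # 2.5 QALYs gained
```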

Quality assessment: a measurement and monitoring function of quality assurance for determining how well health care is delivered in comparison with applicable standards or acceptable bounds of care.

Quality assurance: activities intended to ensure that the best available knowledge concerning the use of health care to improve health outcomes is properly implemented. This involves the implementation of health care standards, including quality assessment and activities to correct, reduce variations in, or otherwise improve health care practices relative to these standards.

Quality of care: the degree to which health care is expected to increase the likelihood of desired health outcomes and is consistent with standards of health care. (See also quality assessment and quality assurance.)

Random variation (or random error): the tendency for the estimated magnitude of a parameter (e.g., based upon the average of a sample of observations of a treatment effect) to deviate randomly from the true magnitude of that parameter. Random variation is independent of the effects of systematic biases. In general, the larger the sample size is, the lower the random variation is of the estimate of a parameter. As random variation decreases, precision increases.
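
The relationship between sample size and random variation can be illustrated by simulation. The sketch below repeatedly draws samples from a hypothetical population (true effect 10, standard deviation 5, normally distributed) and measures how widely the sample means scatter; all of these values and the function name are assumptions.

```python
import random
import statistics

def spread_of_estimates(sample_size: int, n_trials: int = 500,
                        true_effect: float = 10.0, sd: float = 5.0,
                        seed: int = 0) -> float:
    """Standard deviation of the sample mean across repeated simulated trials."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        sample = [rng.gauss(true_effect, sd) for _ in range(sample_size)]
        estimates.append(statistics.fmean(sample))
    return statistics.stdev(estimates)

# Larger samples give estimates that scatter less around the true value.
for n in (25, 100, 400):
    print(n, round(spread_of_estimates(n), 2))
```

Because there is no systematic bias in this simulation, the estimates center on the true value at every sample size; only their scatter shrinks.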

Randomization: a technique of assigning patients to treatment and control groups that is based only on chance distribution. It is used to diminish patient selection bias in clinical trials. Proper randomization of patients is an indifferent yet objective technique that tends to neutralize patient prognostic factors by spreading them evenly among treatment and control groups. Randomized assignment is often based on computer-generated tables of random numbers.
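
As an illustration of chance-based assignment, the sketch below shuffles a list of hypothetical patient identifiers with a computer-generated random sequence and splits it in half. This is only one simple scheme (it forces groups of equal size); the function name and identifiers are assumptions.

```python
import random

def randomize(patient_ids, seed=None):
    """Assign patients to treatment or control purely by chance."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

groups = randomize([f"patient-{i:02d}" for i in range(1, 11)], seed=42)
print(groups["treatment"])
print(groups["control"])
```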

Randomized controlled trial (RCT): a prospective experiment in which investigators randomly assign an eligible sample of patients to one or more treatment groups and a control group and follow patients' outcomes. (Also known as randomized clinical trial.)

Receiver operating characteristic (ROC) curve: a graphical depiction of the relationship between the true positive ratio (sensitivity) and false positive ratio (1 - specificity) as a function of the cutoff level of a disease (or condition) marker. ROC curves help to demonstrate how raising or lowering the cutoff point for defining a positive test result affects tradeoffs between correctly identifying people with a disease (true positives) and incorrectly labeling a person as positive who does not have the condition (false positives).
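
The tradeoff can be sketched by sweeping a cutoff across hypothetical marker values. The data, the function name, and the convention that a value at or above the cutoff counts as a positive test are assumptions for illustration.

```python
def roc_points(diseased, healthy, cutoffs):
    """Return (cutoff, false positive ratio, true positive ratio) per cutoff."""
    points = []
    for cutoff in cutoffs:
        tpr = sum(v >= cutoff for v in diseased) / len(diseased)  # sensitivity
        fpr = sum(v >= cutoff for v in healthy) / len(healthy)    # 1 - specificity
        points.append((cutoff, fpr, tpr))
    return points

# Hypothetical marker levels in people with and without the disease.
diseased = [4.1, 5.0, 5.6, 6.2, 7.3]
healthy = [2.0, 2.9, 3.5, 4.4, 5.2]

for cutoff, fpr, tpr in roc_points(diseased, healthy, [3.0, 4.0, 5.0, 6.0]):
    print(f"cutoff={cutoff}: false positive ratio={fpr:.1f}, true positive ratio={tpr:.1f}")
```

Raising the cutoff from 3.0 to 6.0 in this example lowers the false positive ratio from 0.6 to 0.0, but the true positive ratio also falls from 1.0 to 0.4.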

Register: see database.

Reliability: the extent to which an observation that is repeated in the same, stable population yields the same result (i.e., test-retest reliability). Also, the ability of a single observation to distinguish consistently among individuals in a population.

Relative risk reduction: a type of measure of treatment effect that compares the probability of a type of outcome in the treatment group with that of a control group, i.e.: (Pc - Pt) ÷ Pc. For instance, if the results of a trial show that the probability of death in a control group was 25% and the probability of death in a treatment group was 10%, the relative risk reduction would be: (0.25 - 0.10) ÷ 0.25 = 0.6. (See also absolute risk reduction, number needed to treat, and odds ratio.)
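
The calculation can be sketched as follows. The second call uses hypothetical probabilities ten times smaller to show that the relative risk reduction alone does not reveal the baseline risk; the function name is an assumption.

```python
def relative_risk_reduction(p_control: float, p_treatment: float) -> float:
    """Return (Pc - Pt) / Pc."""
    return (p_control - p_treatment) / p_control

# Example from the entry above: 25% versus 10% probability of death.
print(round(relative_risk_reduction(0.25, 0.10), 2))    # 0.6

# Hypothetical rarer outcome: same relative reduction, but the absolute
# risk reduction is only 0.025 - 0.010 = 0.015 instead of 0.15.
print(round(relative_risk_reduction(0.025, 0.010), 2))  # 0.6
```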

Retrospective study: a study in which investigators select groups of patients that have already been treated and analyze data from the events experienced by these patients. These studies are subject to bias because investigators can select patient groups with known outcomes. (Contrast with prospective study.)

Safety: a judgment of the acceptability of risk (a measure of the probability of an adverse outcome and its severity) associated with using a technology in a given situation, e.g., for a patient with a particular health problem, by a clinician with certain training, or in a specified treatment setting.

Sample size: the number of patients studied in a trial, including the treatment and control groups, where applicable. In general, a larger sample size decreases the probability of making a false-positive error (α) and increases the power of a trial, i.e., decreases the probability of making a false-negative error (β). Large sample sizes decrease the effect of random variation on the estimate of a treatment effect.

Sensitivity: an operating characteristic of a diagnostic test that measures the ability of a test to detect a disease (or condition) when it is truly present. Sensitivity is the proportion of all diseased patients for whom there is a positive test, determined as: [true positives ÷ (true positives + false negatives)]. (Contrast with specificity.)

Sensitivity analysis: a means to determine the robustness of a mathematical model or analysis (such as a cost-effectiveness analysis or decision analysis) that tests a plausible range of estimates of key independent variables (e.g., costs, outcomes, probabilities of events) to determine if such variations make meaningful changes to the results of the analysis. Sensitivity analysis also can be performed for other types of study; e.g., clinical trials analysis (to see if inclusion/exclusion of certain data changes results) and meta-analysis (to see if inclusion/exclusion of certain studies changes results).
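
A one-way sensitivity analysis might be sketched as below: one input of a deliberately simple cost-effectiveness model is varied across a plausible range while the other is held fixed. The model, the cost, the range of QALY gains, and the decision threshold are all hypothetical values chosen for illustration.

```python
def cost_per_qaly(extra_cost: float, extra_qalys: float) -> float:
    """Added cost of the intervention per QALY gained."""
    return extra_cost / extra_qalys

extra_cost = 20000.0  # hypothetical added cost per patient
threshold = 50000.0   # hypothetical maximum acceptable cost per QALY

results = []
for extra_qalys in (0.3, 0.5, 0.8):  # low, base-case, and high estimates
    ratio = cost_per_qaly(extra_cost, extra_qalys)
    results.append((extra_qalys, ratio, ratio <= threshold))
    print(f"QALYs gained={extra_qalys}: cost per QALY={ratio:,.0f}, "
          f"acceptable={ratio <= threshold}")
```

Here the conclusion changes at the low end of the range, so the result of this hypothetical analysis would not be robust to uncertainty in the QALY estimate.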

Series: an uncontrolled study (prospective or retrospective) of a series (succession) of consecutive patients who receive a particular intervention and are followed to observe their outcomes. (Also known as case series or clinical series or series of consecutive cases.)

Specificity: an operating characteristic of a diagnostic test that measures the ability of a test to exclude the presence of a disease (or condition) when it is truly not present. Specificity is the proportion of non-diseased patients for whom there is a negative test, expressed as: [true negatives ÷ (true negatives + false positives)]. (Contrast with sensitivity.)
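
Sensitivity and specificity can both be computed from the four cells of a 2x2 table of test results against true disease status. The counts and function names below are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of diseased patients who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of non-diseased patients who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical study: 100 diseased patients (90 test positive) and
# 200 non-diseased patients (160 test negative).
print(sensitivity(true_pos=90, false_neg=10))   # 0.9
print(specificity(true_neg=160, false_pos=40))  # 0.8
```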

Statistical power: see power.

Statistical significance: a conclusion that an intervention has a true effect, based upon observed differences in outcomes between the treatment and control groups that are sufficiently large so that these differences are unlikely to have occurred due to chance, as determined by a statistical test. Statistical significance indicates the probability that the observed difference was due to chance if the null hypothesis is true; it does not provide information about the magnitude of a treatment effect. (Statistical significance is necessary but not sufficient for clinical significance.)

Statistical test: a mathematical formula (or function) that is used to determine if the difference in outcomes of a treatment and control group is great enough to conclude that the difference is statistically significant. Statistical tests generate a value that is associated with a particular P value. Among the variety of common statistical tests are: F, t, Z, and chi-square. The choice of a test depends upon the conditions of a study, e.g., what type of outcome variable is used, whether or not the patients were randomly selected from a larger population, and whether it can be assumed that the outcome values of the population have a normal distribution or other type of distribution.

Surrogate endpoint: an outcome measure that is used in place of a primary endpoint (outcome). Examples are decrease in blood pressure as a predictor of decrease in strokes and heart attacks in hypertensive patients, and increase in T-cell (a type of white blood cell) counts as an indicator of improved survival of AIDS patients. Use of a surrogate endpoint assumes that it is a reliable predictor of the primary endpoint(s) of interest.

Synthetic (or integrative) study: a study that does not generate primary data but that involves the qualitative or quantitative consolidation of findings from multiple primary studies. Examples are literature review, meta-analysis, decision analysis, and consensus development.

Systematic review: a form of structured literature review that addresses a question that is formulated to be answered by analysis of evidence, and involves objective means of searching the literature, applying predetermined inclusion and exclusion criteria to this literature, critically appraising the relevant literature, and extracting and synthesizing data from the evidence base to formulate findings.

Technological imperative: the inclination to use a technology that has potential for some benefit, however marginal or unsubstantiated, based on an abiding fascination with technology, the expectation that new is better, and financial and other professional incentives.

Technology: the application of scientific or other organized knowledge--including any tool, technique, product, process, method, organization or system--to practical tasks. In health care, technology includes drugs; diagnostics, indicators and reagents; devices, equipment and supplies; medical and surgical procedures; support systems; and organizational and managerial systems used in prevention, screening, diagnosis, treatment and rehabilitation.

Time lag bias: a form of bias that may affect identification of studies to be included in a systematic review; occurs when the time from completion of a study to its publication is affected by the direction (positive vs. negative findings) and strength (statistical significance) of its results.

Treatment effect: the effect of a treatment (intervention) on outcomes, i.e., attributable only to the effect of the intervention. Investigators seek to estimate the true treatment effect using the difference between the observed outcomes of a treatment group and a control group. (See effect size.)

Type I error: same as false-positive error.

Type II error: same as false-negative error.

Utility: the relative desirability or preference (usually from the perspective of a patient) for a specific health outcome or level of health status.

Validity: the extent to which a measure accurately reflects the concept that it is intended to measure. (See internal validity and external validity.)

Last reviewed: 08 September 2008
Last updated: 08 September 2008
First published: 18 August 2004
Metadata| Permanence level: Permanent: Stable Content