National Cancer Institute, U.S. National Institutes of Health (www.cancer.gov)
Division of Cancer Prevention

Biometry Research Group

Curriculum Vitae for Vance Berger, PhD

Mathematical Statistician

US Mail Address
Biometry Research Group, DCP
National Cancer Institute
Executive Plaza North, Room 3131
6130 Executive Blvd MSC 7354
Bethesda, MD 20892-7354
Shipping Address
Biometry Research Group, DCP
National Cancer Institute
6130 Executive Blvd Room 3131
Rockville, MD 20852

Phone: 301-435-5303 • Fax: 301-402-0816
E-mail: vb78c@nih.gov

Clinical Trial Training Internships (see below)

Education

1995 -- Ph.D. (Statistics) Rutgers University, GPA=4.00.
1989 -- M.S. (Statistics) Stanford University, GPA=3.59.
1987 -- B.S. (Biometry) Cornell University, GPA=3.52.


Employment

10/1999-present Mathematical Statistician
Biometry Research Group, Division of Cancer Prevention
National Cancer Institute, National Institutes of Health
Review the design and statistical analysis sections of clinical trial protocols (of preventative agents and cancer vaccines) submitted as grant proposals. Develop novel statistical methodology as needed to support these functions. Supervise the research work of several student interns.
8/1995-10/1999 Mathematical Statistician (CBER, 7/96-10/99)
Staff Fellow (CDER, Biometrics, Oncology, 8/95-7/96)
Food and Drug Administration (FDA)
Reviewed NDAs, PLAs, BLAs, ELAs, and INDs for new vaccines (anthrax, cholera) and therapeutics. Determined if the use of standard biostatistical procedures would lead to inappropriate regulatory decisions and, if so, developed new procedures. Conveyed both orally and in writing complicated biostatistical ideas to FDA colleagues, sponsoring organizations, and several Advisory Committees. Supervised the regulatory and research work of student interns. Helped to write the Good Reviewer Practices Track 4 Document, "Guidelines for Reviewers During Product Development". Designed and taught a class in the design and analysis of clinical trials (offered for CME credit).
2/1995-8/1995 Biostatistical Consultant
The Cancer Institute of New Jersey; New Brunswick, NJ
Wrote the design and analysis sections of study protocols for a variety of cancer treatments, as well as for grant applications. Critiqued proposed research at Scientific Review Board meetings.
6/1995-8/1995 SAS Consultant
Clinical and Scientific Affairs
Pfizer; New York, NY
Prepared datasets for the integrated safety summary of the azithromycin (anti-infective) NDA by combining data across studies and nations.
11/1991-6/1995 Senior Biostatistician
Theradex; Princeton, NJ
Created a new Biostatistics Department. Wrote SOPs for data flow and interactions with internal staff, clients, IRBs, and the FDA. Served as a Project Manager. Designed and analyzed oncology and biotechnology clinical trials. Computed sample sizes; performed randomizations; designed pharmacoeconomics CRFs; wrote statistical and pharmacoeconomics portions of protocols, analysis plans, and (final and interim) reports.
2/1990-11/1991 Biostatistician
The Janssen Research Foundation (Johnson and Johnson)
Piscataway, NJ
Total project (oncology, cardiology) responsibility included protocol & CRF review, database structuring, writing analysis plans and reports, and directing the work of SAS programmers and data reviewers.
4/1989-1/1990 Statistician
VA Hospital Anesthesiology
Stanford University; Palo Alto, CA
Studied the quantification of depth of anesthesia.
6/1988-6/1989 Research Assistant
Stanford Linear Accelerator Center
Studied ranking and selection methods.
6/1987-9/1987 and 12/1987 Statistical Consultant
Marketing Research
Pfizer; New York, NY
Developed a model for AIDS epidemiology.


Selected Research

Ordered Categorical Data

Within the context of comparative Phase III randomized clinical trials, many safety and efficacy endpoints are measured on an ordinal scale. That is, the data consist of a set of categories, and there is a natural ordering among these categories. The relative spacings among the categories, however, are not known. For example, pain may be measured as severe, moderate, mild, or none. Clearly severe is worse than moderate, which is worse than mild, which is worse than none. Yet it cannot be stated with certainty that moderate is halfway between mild and severe. If it could, then the data would be interval-scaled and not ordinal-scaled.

The goal of a between-group analysis of an ordered categorical endpoint is to establish that one treatment tends to be associated with preferable outcomes compared to the other treatment. Typically, one of two methods is used. The first is to dichotomize the endpoint into a single "success" category and a single "failure" category. The other approach is to assign numerical scores to the categories for use with a linear rank test. Often these scores are equally spaced, such as 1, 2, 3, 4 for the four pain categories none, mild, moderate, severe. In fact, dichotomization is itself a special case of score assignment, in which only two distinct scores are used: 0 for the "failure" categories and 1 for the "success" categories.
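As a concrete illustration of how much the choice of scores can matter, the following sketch runs a Monte Carlo permutation version of a linear rank test on a hypothetical 2 x 4 pain table, once with equally spaced scores and once with the 0/1 scores that correspond to dichotomization. The counts, the function name, and the number of permutations are all illustrative assumptions, not data or code from any of the trials discussed here.

import numpy as np

rng = np.random.default_rng(0)

def linear_rank_pvalue(control, active, scores, n_perm=20_000):
    # One-sided Monte Carlo permutation p-value for a linear rank (score-sum)
    # statistic on a 2 x K ordered contingency table given as per-category counts.
    control, active, scores = map(np.asarray, (control, active, scores))
    pooled = np.concatenate([np.repeat(scores, control), np.repeat(scores, active)])
    n_active = int(active.sum())
    observed = np.repeat(scores, active).sum()
    hits = sum(rng.permutation(pooled)[:n_active].sum() >= observed
               for _ in range(n_perm))
    return hits / n_perm

# Hypothetical pain counts per category, ordered none, mild, moderate, severe.
control = [10, 15, 20, 5]
active = [22, 16, 10, 2]
print(linear_rank_pvalue(control, active, scores=[4, 3, 2, 1]))  # equally spaced scores
print(linear_rank_pvalue(control, active, scores=[1, 1, 0, 0]))  # dichotomization at mild/moderate

Running both calls on the same counts shows how two defensible analyses of the same table can return noticeably different p-values.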

Some data sets are such that the treatment effect is sufficiently strong that any set of monotonic scores can be used with a linear rank test, and the resulting p-value will attain statistical significance. However, there are other data sets for which the choice of scores will have a profound influence on the p-value, and on the interpretation of the results. Some researchers will get around this issue by claiming that if the treatment effect is sufficiently robust, then good results should be seen for any choice of scores. However, this raises the hurdle from requiring statistical significance of a valid test of the sponsor's choosing to requiring statistical significance of essentially every test. In other situations this level of evidence is not required to claim a treatment effect, and it seems questionable to require such evidence for only ordered categorical data.

Ivanova and Berger (2001) made the point that no set of scores can be "right" or "wrong" when the data structure is ordinal but not interval. If one has some idea about the relative magnitudes of category shifts that the active treatment will produce compared to the control treatment, then this would form the basis for selecting the scores in an optimal way. In fact, Berger, Permutt, and Ivanova (1998) made explicit the form of the locally most powerful test for any given alternative, and it turns out to always be a linear rank test with some set of scores. However, more often than not the precise form of the knowledge that would be required to make this determination is simply not available, or is not reliable, at the time that one would need to select an analysis. Consequently, one could make a strong case that nonlinear rank tests are preferable to linear rank tests. In fact, Berger and Ivanova (2002) illustrated an example in which any linear rank test, using any set of scores, would necessarily have zero power to detect certain alternatives of interest.

One nonlinear rank test that is nonlinear enough to offer good global power, and that is available in a standard software package (StatXact), is the exact two-sample Smirnov test, often confused with the better-known one-sample Kolmogorov-Smirnov test. This test is based on the largest difference in empirical CDFs and has a rejection region with piece-wise linear boundaries. The Smirnov test makes use of the ordering structure among the categories, but does not require the assignment of arbitrary numerical scores.
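For readers who want to see the mechanics, here is a minimal sketch of a Monte Carlo permutation version of the two-sample Smirnov test for a 2 x K ordered table. The counts and the function name are hypothetical, and an exact implementation such as the one in StatXact handles the reference distribution more carefully, so treat this only as an illustration of the statistic.

import numpy as np

rng = np.random.default_rng(1)

def smirnov_pvalue(control, active, n_perm=20_000):
    # Monte Carlo permutation p-value for the two-sample Smirnov statistic
    # (largest absolute difference between the two empirical CDFs) on a
    # 2 x K ordered contingency table given as per-category counts.
    control, active = np.asarray(control), np.asarray(active)
    k = control.size
    labels = np.concatenate([np.repeat(np.arange(k), control),
                             np.repeat(np.arange(k), active)])
    n_c = int(control.sum())

    def stat(lab):
        c = np.bincount(lab[:n_c], minlength=k)
        a = np.bincount(lab[n_c:], minlength=k)
        return np.abs(np.cumsum(c) / n_c - np.cumsum(a) / (lab.size - n_c)).max()

    observed = stat(labels)
    hits = sum(stat(rng.permutation(labels)) >= observed for _ in range(n_perm))
    return hits / n_perm

# Hypothetical counts over four ordered outcome categories.
print(smirnov_pvalue([5, 10, 20, 15], [15, 20, 10, 5]))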

Berger and Sackrowitz (1997) demonstrated that the Smirnov test is not generally admissible, which means that there exists another test at the same alpha level with uniformly better power. Proofs of inadmissibility are not often constructive, but Berger and Sackrowitz (1997) actually illustrated not only the inadmissibility of the Smirnov test, but also methods for constructing every test that is uniformly more powerful than it. However, these constructions can be complicated and difficult to automate.

Permutt and Berger (2000) proposed a modification of the Smirnov test that is more along the lines of the Anderson-Darling test. Berger (1998) proposed the convex hull class of tests, based on directional convex hull peeling of the permutation sample space, all of which are admissible. The simplest of these tests was studied by Berger, Permutt, and Ivanova (1998) under the name "the convex hull test". This test was shown to have good global conditional power properties. We continue to work on tests that are simple to implement and interpret, offer good global power, and would allow one to benefit from prior information without compromising either validity or good global power. Some of the tests we have developed can be run, at least for 2x3 contingency tables, from programs written in S-PLUS by Anastasia Ivanova. For articles and S-PLUS programs, go to http://www.bios.unc.edu/~aivanova/ordinal/.

Reality-Based Statistical Analysis

One of the objectives of my research is to clarify for researchers the need for statistical analyses that do not go beyond the realities of the way in which the data were collected. For example, nothing in the collection of data will tell you that the data have a normal distribution, or that two groups will have equal variances, or that hazards are proportional, or that odds ratios across strata are common. Making these or other assumptions will weaken the subsequent analysis, because the validity of the analysis will be predicated on the truth of the assumption. Put another way, if one absorbs the assumption into the null hypothesis, then rejection of the null hypothesis cannot necessarily be inferred to be a rejection of the intended null hypothesis (typically equality of the treatment groups). Rather, it may be a rejection of the unintended null hypothesis (the assumption or assumptions). Even testing for the assumption(s) will not solve this problem. Micceri (Psychological Bulletin 1989:105(1);156-166) examined numerous distributions for normality and found that "No distributions among those investigated passed all tests of normality, and very few seem to be even reasonably close approximations to the Gaussian. It therefore appears meaningless to test either ability or psychometric distributions for normality, because only weak tests or chance occurrences should return a conclusion of normality. Instead, one should probably heed Geary's (1947) caveat and pretend that 'normality is a myth; there never was, and never will be, a normal distribution' (p. 241)."

Loosely speaking, when an analysis makes use of the data but not any extraneous assumptions, it is exact. Before we can consider which tests are exact, we need to clarify the idea of exactness. The issue is preservation of the Type I error rate, or alpha level. If the actual level is the same as the nominal level, then the test is exact. Otherwise, one could perform a test at level 0.05 (or any other level), and have the actual probability of rejection exceed 0.05 even if the null hypothesis is true. We need to consider why the actual level might not be the same as the nominal level. One example is the chi-square test, applied to compare two binomials. Suppose that in one group there are two successes out of ten, and in the other group there are six successes out of ten. The data can be displayed as a 2x2 contingency table as follows:

          Failure   Success   Total
Control      8         2        10
Active       4         6        10
Total       12         8        20

With fixed margins (ten subjects per group, eight total successes, 12 total failures), there are nine possible outcomes, because the count in the lower-right cell (for example) can range from zero to eight. Denote the observed outcome as {(8,2),(4,6)}. With this same notation, the nine outcomes are:

Number   Cell Counts        Null Probability   Cumulative   Chi-square p (one-sided)
1        {(10,0),(2,8)}     0.0004             0.0004       0.0001
2        {(9,1),(3,7)}      0.0095             0.0099       0.0031
3        {(8,2),(4,6)}      0.0750             0.0849       0.0339
4        {(7,3),(5,5)}      0.2401             0.3250       0.1807
5        {(6,4),(6,4)}      0.3501             0.6750       0.5000
6        {(5,5),(7,3)}      0.2401             0.9151       0.1807
7        {(4,6),(8,2)}      0.0750             0.9901       0.0339
8        {(3,7),(9,1)}      0.0095             0.9996       0.0031
9        {(2,8),(10,0)}     0.0004             1.0000       0.0001

If the chi-square test is used with a one-sided nominal level of 0.035, then each of the first three outcomes would lead to statistical significance (p<0.035). The null probability of this three-point critical region is 0.0849, so the actual significance level is 0.0849, and not 0.035. There is a discrepancy between the hypergeometric null probabilities and the chi-square null probabilities. So which is correct?

We answer this question indirectly by first addressing a larger issue, specifically the fact that what constitutes an exact test depends on the context. We consider the objective of comparing two treatment groups, in hopes of establishing the superiority of one over the other, in a randomized clinical trial. In this case, the typical null hypothesis is that the two treatments are equivalent in their ability to bring about a response in a given patient population. This may mean either that the two are equivalent for each patient (the strong null hypothesis) or that, for the patients in whom the two treatments differ, each treatment has the same likelihood of being the better one (the weak null hypothesis). We focus on the strong null hypothesis for reasons articulated by Berger (Statistics in Medicine 2000;19:1319-1328).

Now to find the Type I error rate, or the probability of finding a treatment effect when none exists, we must act under the assumption that none exists (the strong null hypothesis). One consequence of the strong null hypothesis is that had another randomization sequence been observed, causing some patients to receive the treatment other than the one they actually received, the responses would not differ. This means that whatever response categories there may be, there will be the same number of patients (combining treatment groups) falling into each of these categories, regardless of how the randomization turns out. In the 2x2 contingency table example above, this means that the margins are fixed, and the hypergeometric probabilities are "correct".
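The table above, and the actual level of the chi-square test at the 0.035 nominal level, can be recomputed directly. The following sketch is my own illustration (the variable names and the choice of a Pearson statistic without continuity correction are assumptions): it loops over the nine outcomes with the margins fixed, prints the hypergeometric null probability and the one-sided chi-square p-value for each, and then adds up the null probabilities of the outcomes that the chi-square test rejects in the direction favoring the active treatment.

import numpy as np
from scipy.stats import hypergeom, chi2

M, draws, successes = 20, 10, 8     # population size, active-group size, total successes
actual_level = 0.0
for s_active in range(8, -1, -1):   # successes in the active group indexes the nine outcomes
    s_control = successes - s_active
    table = np.array([[10 - s_control, s_control],    # control: failures, successes
                      [10 - s_active, s_active]])     # active:  failures, successes
    null_prob = hypergeom.pmf(s_active, M, successes, draws)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    stat = ((table - expected) ** 2 / expected).sum()  # Pearson chi-square, no correction
    p_one_sided = chi2.sf(stat, df=1) / 2
    print(table.tolist(), round(null_prob, 4), round(p_one_sided, 4))
    if s_active > s_control and p_one_sided < 0.035:   # rejected in the active-favoring direction
        actual_level += null_prob
print(round(actual_level, 4))       # about 0.0849, not the nominal 0.035

The printed rows reproduce the table, and the final figure is the 0.0849 actual level discussed above.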

The chi-square p-values are also correct, but one must consider the question for which they provide the correct answer. This question would be "How likely would one be, when sampling from a chi-square distribution with one degree of freedom, to find a result as extreme as or more extreme than the one actually observed?". This may sound like a compelling question, but compare it to the one for which the hypergeometric (or Fisher's exact test) p-values are correct: "How likely would one be, when hypothetically repeating the experiment (re-randomizing the patients), to find a result as extreme as or more extreme than the one actually observed?". This latter question seems the more relevant, and Fisher's exact test gives the right answer to this question. In general, it turns out that exact tests for this type of question are design-based permutation tests (Berger, Statistics in Medicine 2000:19;1319-1328). Additional information on randomization tests may be found at http://www.okstate.edu/artsci/botany/ordinate/permute.htm or http://www.jiscmail.ac.uk/lists/exact-stats.html.
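In software, the answer to that re-randomization question for the observed table is simply the one-sided Fisher's exact p-value. A quick check (a scipy call; with the table oriented as below, the 'greater' alternative corresponds to the direction favoring the active group) returns the cumulative hypergeometric probability of the first three outcomes.

from scipy.stats import fisher_exact

# Observed table: rows are control and active, columns are failures and successes.
_, p = fisher_exact([[8, 2], [4, 6]], alternative='greater')
print(round(p, 4))   # about 0.0849, the same three-outcome probability as above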

The above chi-square example clearly illustrates that approximate tests need not preserve the nominal Type I error rate, but in this example the inflation was of a magnitude that might not matter to some researchers. As such, it is fair to ask how far off the mark an approximate test could be relative to an exact test. Berger (Statistics in Medicine 2000;19:1319-1328) provides a partial answer to this question, in the form of an example of a real data set with large numbers of patients in each treatment group. Specifically, the data are ordered categorical, with three categories (confirmed reinfarction, unconfirmed reinfarction, no reinfarction). In the placebo group, the numbers of patients falling into these three categories were (33,5,545). In the sotalol group, the numbers were (29,8,836). With over 550 patients per treatment group, one might expect the approximate test to be close to the exact one. Yet when using the Smirnov test, the two-sided exact p-value (subject to very small Monte Carlo sampling error) is 0.0485, whereas the approximate two-sided Smirnov p-value is 0.9910. This example underscores the importance of using exact tests, even when the need to do so is not immediately obvious. Hays (Statistics, 3rd Edition, Holt, Rinehart, and Winston, New York, 1973) voices essentially the same concern about the assumptions behind normal-theory methods, asserting that the "assumption of random sampling is not to be taken lightly ... Unless this assumption is at least reasonable, the probability results of inferential methods mean very little and these [normal curve] methods might as well be omitted" (p. 197). Suffice it to say that the alleged robustness of parametric analyses may be questioned (Hunter and May, Canadian Psychology 1993;34:384-389).
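The size of that discrepancy is easy to reproduce, at least approximately. The sketch below is my own illustration, not the calculation from the cited paper: it computes the Smirnov statistic from the reinfarction counts and plugs it into the large-sample Kolmogorov approximation, which is the kind of value an asymptotic routine reports. A design-based permutation p-value, such as the smirnov_pvalue sketch given earlier applied to these same counts, should instead land near the exact value of roughly 0.05, up to Monte Carlo error.

import numpy as np

# Reinfarction counts per ordered category
# (confirmed reinfarction, unconfirmed reinfarction, no reinfarction).
placebo = np.array([33, 5, 545])
sotalol = np.array([29, 8, 836])

# Observed Smirnov statistic: largest absolute difference between the empirical CDFs.
D = np.abs(np.cumsum(placebo) / placebo.sum() - np.cumsum(sotalol) / sotalol.sum()).max()

# Large-sample (Kolmogorov) two-sided approximation.
n, m = placebo.sum(), sotalol.sum()
lam = np.sqrt(n * m / (n + m)) * D
p_asymptotic = 2 * sum((-1) ** (j - 1) * np.exp(-2 * j**2 * lam**2) for j in range(1, 101))
print(round(D, 4), round(p_asymptotic, 4))   # D is about 0.023; p is about 0.99

# For the permutation answer, reuse the earlier sketch:
# smirnov_pvalue(placebo, sotalol, n_perm=100_000)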

With state-of-the-art computing power and algorithms, it is not only possible, but also easy to perform some good design-based permutation tests. However, this was not always the case, and this is one reason why approximate tests caught on to the extent that they did. They bridged a gap between what was desirable and what was possible. That gap is narrowing, and the niche once filled by approximate tests is disappearing.


Selection Bias in Randomized Clinical Trials

Protection against selection bias is often cited as one of the main reasons why results from randomized clinical trials are more reliable than results from nonrandomized studies. Specifically, when the study is not randomized, there are mechanisms that can lead to qualitatively different types of patient populations in the treatment groups to be compared. For example, without randomization, healthier patients may systematically be given one treatment while sicker patients are systematically given the other treatment. In this situation, demonstrating a between-group difference after treatment does not establish a treatment effect, because the treatment groups were different to start with. This is analogous to running a race in which the runners have different starting points (one has a head start).

One fundamental benefit of randomization is that it makes it more difficult for selection bias to occur, and helps to level the playing field (at least at baseline). However, what is less widely recognized is that selection bias can occur and does occur even in randomized clinical trials. The mechanism for selection bias in a randomized clinical trial is more subtle, and related to allocation concealment, or the lack thereof. Allocation concealment is essentially the absence of any ability to predict the treatment to be assigned (Schulz, Canadian Medical Association Journal 1995;153(6):783-786). Without allocation concealment, it is possible to base enrollment decisions on the combination of the treatment expected to be assigned and a general assessment of the ability of the patient to respond to treatment. This can lead to healthier patients being assigned to one treatment group and sicker patients assigned to the other treatment group. The resulting selection bias can be sufficient to drastically inflate the chances that the active treatment is found superior to the control treatment, even in the absence of a real benefit (Proschan, Statistica Sinica 1994;4:219-231). Clearly, the patients who rely on the results of clinical trials for their medical decisions are entitled to reliable results. As such, steps need to be taken to ensure that selection bias does not compromise the integrity of clinical trials. In fact, Chalmers (Biometrics 1990;46:20-22) was "convinced that the most essential requirement for a good clinical trial must be that the clinician who is deciding whether a patient should be randomized must have absolutely no clue as to which treatment is more likely to be selected. There are too many loopholes in the criteria for admission or rejection of patients before randomization, and too great an opportunity for physicians to project their doubts to the patient when seeking to obtain informed consent."
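A small simulation makes the mechanism concrete. The sketch below is entirely hypothetical (the block size, the effect sizes, and the assumption that the investigator can always predict the second slot in a block of two are all illustrative): it randomizes in permuted blocks of size two with no allocation concealment, lets the investigator steer a healthier patient into a slot expected to be active and a sicker one into a slot expected to be control, and then counts how often a standard t-test declares a difference even though the two treatments are identical.

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)

def one_trial(n_blocks=20, bias=0.5):
    # Permuted blocks of size two, no allocation concealment.  The first slot in
    # each block is unpredictable, but the second is then known with certainty.
    # Both treatments are identical, so the null hypothesis is true.
    outcomes = {'A': [], 'C': []}
    for _ in range(n_blocks):
        block = ['A', 'C'] if rng.random() < 0.5 else ['C', 'A']
        outcomes[block[0]].append(rng.normal(0.0, 1.0))       # ordinary patient
        shift = bias if block[1] == 'A' else -bias            # biased enrollment into
        outcomes[block[1]].append(rng.normal(shift, 1.0))     # the predictable slot
    return ttest_ind(outcomes['A'], outcomes['C']).pvalue

rejections = np.mean([one_trial() < 0.05 for _ in range(2000)])
print(rejections)   # well above the nominal 0.05 despite no treatment effect

Larger blocks, or a procedure that actually conceals the allocations, would shrink this inflation back toward the nominal level.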

Selection bias cannot occur if investigators do not allow it to happen. However, there is evidence that at least some investigators will engage in selection bias if they are allowed to (Schulz, JAMA 1995;274(18):1456-1458). Moher et al. (Lancet 1998;352:609-613) found that clinical trials without adequate allocation concealment produced estimates of treatment effects that were 37% larger than those from trials with adequate allocation concealment. Kunz and Oxman (British Medical Journal 1998;317:1185-1190) confirmed this potential, but also pointed out that sometimes the effect is reversed. The potential for selection bias in specific studies, some well-known, was discussed by Ellenberg et al. (New England Journal of Medicine 1994;331(3):203-205), Schulz (Canadian Medical Association Journal 1995;153(6):783-786), and Bailar and MacMahon (Canadian Medical Association Journal 1997;156(2):193-199).

This problem may be quite widespread, because many clinical trials claim to be masked (which would ensure allocation concealment), yet are not (Moscucci et al., Clinical Pharmacology and Therapeutics 1987;41(3):259-265; Greenberg and Fisher, Controlled Clinical Trials 1994;15:244-246; Ney, Canadian Medical Association Journal 140, 15; Basoglu et al., Archives of General Psychiatry 1997;54:744-748). See also Day (Blinding or Masking, in Encyclopedia of Biostatistics, Volume 1, Armitage and Colton, Editors, John Wiley and Sons, Chichester, 1998, 410-417) for more information on the potential ineffectiveness of masking procedures.

As an example of a published clinical trial that appears to provide strong evidence in favor of the active treatment, consider a recent study (Lovell et al., New England Journal of Medicine 2000;342(11):763-769) of etanercept for children with juvenile rheumatoid arthritis. There were statistically significant imbalances at baseline between the two treatment groups, as mentioned in the publication.

We see that patients in the etanercept group were younger (p=0.0026), less likely to be Caucasian (p=0.022), and of lower mean weight (p=0.027) than patients in the placebo group. The low p-values make it unlikely that these differences were due to chance alone. The publication makes no attempt to explain these baseline differences, but more information is provided in the FDA review. Specifically, we find that the three-month open-label run-in on etanercept increased the likelihood of unmasking treatment allocations. In addition, the randomization was performed with blocks of size two, the worst situation for selection bias (Proschan, Statistica Sinica 1994;4:219-231). To make matters even worse, these blocks were related to one another (mirror images across strata). These design flaws were not mentioned in the publication. The FDA investigation into selection bias revealed that some patients were randomized out of order. Also, four patients were randomized from the wrong strata, and in two cases this reversed the treatment received. The published result that etanercept lowers the flare rate (p=0.003) may seem less impressive when one considers that switching as few as three patients from one group to the other could break the observed significance. See Berger (The Statistician 2001;50(1):79-85) for more information about this type of calculation.
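The type of calculation referred to in that last sentence is easy to carry out once the two-by-two table is in hand. The sketch below uses purely hypothetical flare counts (not the counts from the etanercept study) and simply recomputes Fisher's exact p-value after reassigning k placebo patients who flared to the active arm, with their outcomes unchanged, to show how sensitive a small p-value can be to a handful of possibly mis-randomized patients.

from scipy.stats import fisher_exact

# Hypothetical counts: (flares, no flares) in each arm.
active_flare, active_ok = 7, 18
placebo_flare, placebo_ok = 19, 7

for k in range(5):
    # Reassign k placebo patients who flared to the active arm, outcomes unchanged.
    table = [[active_flare + k, active_ok], [placebo_flare - k, placebo_ok]]
    _, p = fisher_exact(table)
    print(k, round(p, 4))   # the p-value climbs quickly as k grows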

This example illustrates the potential for selection bias to lead to misleading results, which in turn drive medical decisions. If we cannot trust investigators to police themselves, then we need to seek other methods to ensure that clinical trials remain free from selection bias, or at least that reporting is accurate so that readers recognize when selection bias is an issue. Berger and Exner (Controlled Clinical Trials 1999;20:319-327) suggest testing for unobserved selection bias following a randomized clinical trial by examining, within each treatment group, whether the investigator's expected likelihood that a patient would receive the active treatment predicts the response variable. This approach complements testing for baseline comparability (which detects observable selection bias) because it can detect selection bias even when none of the measured baseline variables are imbalanced. If data are collected on patients who are screened but not randomized, then one can also study the joint relationship among the baseline covariates, the expected (by the investigator) likelihood of a patient to receive the active treatment, and the decision to randomize a patient or not. Berger and Exner (Controlled Clinical Trials 1999;20:319-327) used the availability of this type of information to demonstrate the potential for selection bias in the Coronary Artery Surgery Study (CASS Investigators, Journal of the American College of Cardiology 1984;3:114-128).
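The following toy sketch shows the mechanics of that within-group check. Everything here is simulated and simplified (blocks of size two, a bias deliberately built into enrollment, and a plain correlation in place of whatever modeling a real analysis would use); it is meant only to show that when enrollment decisions track the predictable allocations, the predicted-assignment probability becomes associated with the outcome within each arm.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)

# For every enrolled patient, record the probability (given the allocations
# already revealed within the current block of two) that the upcoming
# assignment is active, then enroll healthier patients when that probability
# is high.  Both treatments are identical.
prob_active, arm, outcome = [], [], []
for _ in range(200):
    block = ['A', 'C'] if rng.random() < 0.5 else ['C', 'A']
    for slot, assigned in enumerate(block):
        p_next = 0.5 if slot == 0 else (1.0 if assigned == 'A' else 0.0)
        prob_active.append(p_next)
        arm.append(assigned)
        outcome.append(rng.normal(1.0 * p_next, 1.0))   # biased enrollment

prob_active, arm, outcome = map(np.array, (prob_active, arm, outcome))
for g in ('A', 'C'):
    r, p = pearsonr(prob_active[arm == g], outcome[arm == g])
    print(g, round(r, 2), round(p, 4))   # a within-arm association flags selection bias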

Unfortunately, the information required to detect selection bias is rarely available from publications. As such, there is no way to estimate how prevalent or extreme it may be. The quality of clinical trials would be greatly improved if the complete data were generally made available, perhaps on a web site, as a condition of publication (Hutchon, British Medical Journal 2001;322:530). For detecting selection bias, one would ideally need a log of the dates and times at which each patient or subject was screened, whether or not they were randomized (with a reason if they were not randomized). Also, one would need to know the allocation sequence used, as well as the allocation procedure (the list of allowable allocation lists, with the probability of each) that was used to generate this allocation sequence. Finally, one would need the relevant baseline and post-treatment outcome data of each patient or subject randomized, and preferably of each patient or subject screened (whether or not they were randomized). From this basic information one should be able to determine the treatment assignments for each patient or subject. However, because there may be errors in the randomization (as in the etanercept study mentioned above), it is important that the actual treatment assignments be provided as well.

While the detection of selection bias is an important issue, prevention and adjustment are also important issues. In their Discussion and Recommendations Section, Berger and Exner (Controlled Clinical Trials 1999;20:319-327) provide some suggestions for preventing selection bias (Points #1, #3, #4, and #5). In Point #7, they offer suggestions for adjusting for selection bias, if it is found, because there may be a way to salvage reliable information even in the presence of selection bias. If selection bias is found, then the sanctity of the randomization is in doubt, and the popular intent-to-treat approach to between-group analysis may lose some of its appeal, not because it goes too far in adhering to the randomization, but rather because it does not go far enough. One might consider a modification suggested by Berger and Exner (Controlled Clinical Trials 1999;20:319-327), in which one would conduct a permutation test, but building in selection bias, so that the null distribution of p-values (to which the observed p-value is to be compared, as the observed p-value now serves as only a test statistic) is skewed towards lower values. Another modification, which might be called the "intent-to-randomize" approach, would extend consideration to not only the actual randomization but also the intended randomization.

More work is needed in the prevention of, detection of, and adjustment for selection bias in clinical trials. One area is to develop a competitor to the randomized blocks procedure, on the grounds that the randomized blocks procedure allows for so much prediction of future allocations based on past ones. Another area is the documentation of selection bias, when it is possible to do so. Without this, many may consider selection bias to be only a hypothetical concern, or to exist only in the realm of small or non-randomized studies.


Adjunct Teaching

1/2003-present Health Care Policy
Columbia Union College
Gaithersburg, MD
9/2000-present Department of Mathematics
University of Maryland Baltimore County (UMBC)
Baltimore, MD
Directed Judie Zhou's PhD dissertation on partial orderings 7/02
Promoted to Affiliate Associate Professor 3/03
11/1997-9/1998 Department of Biostatistics
MPH and Biotechnology Programs
Johns Hopkins University
Baltimore, MD
1/1994-4/1994 School of Business
MBA Program
Rutgers University
New Brunswick, NJ


Honors, Awards, and Scholarships

Gertrude Cox Award (for distinguished contributions by a statistician in mid-career), Research Triangle Park and Washington Statistical Society, 2006.

Elected Member, International Statistics Institute, 2006.

Awarded an FDA (CBER) Group Recognition Award for efficient and thorough review work, 1998.

Awarded an FDA (CBER) Recognition Award for extraordinary contributions to the evaluation of anthrax vaccine, 1998.

Awarded FDA Center Director's Awards for research from each of CBER and CDER, 1998.

Awarded an FDA (CBER) On-the-Spot Award Certificate for reviews, research, and instituting an internship program, 1997.

Awarded an FDA (CBER) Certificate of Appreciation for designing and teaching a biostatistics class, 1997.

Honorable Mention, National Science Foundation Graduate Fellowship, 1987.

Received the Hoover Award for top freshman mathematics students at Cornell University, 1984.


Referee Service

  • American Journal of Epidemiology
  • American Statistician
  • Annals of Statistics
  • Archives of Internal Medicine
  • BioMed Central Anesthesiology
  • BioMed Central Cancer
  • Biometrics
  • Clinical Trials
  • Controlled Clinical Trials
  • Journal of the American Medical Association (JAMA)
  • Journal of the American Statistical Association (JASA)
  • Journal of Biopharmaceutical Statistics
  • Journal of Clinical Epidemiology
  • Journal of Clinical Oncology
  • Journal of the National Cancer Institute (JNCI)
  • Journal of Nonparametric Statistics
  • Journal of Statistical Planning and Inference (JSPI)
  • Journal of Urology
  • Lifetime Data Analysis
  • Medical Decision Making
  • Metron
  • Ophthalmology
  • Pan American Journal of Public Health
  • Statistical Science
  • Statistics and Decisions
  • Statistics and Probability Letters
  • Statistics in Medicine


Editorial Boards

2008 -- Guest Editor, Special Issue of Statistical Methods in Medical Research on "Controversies in Statistics"

2004-present -- Biometrical Journal

2001-present -- Assistant Editor, Journal of Modern Applied Statistical Methods


Reviewer for Book Proposals

John Wiley & Sons
Marcel Dekker
Springer-Verlag


Reviewer for Protocols

Cochrane Collaboration Protocol

NIAID External Scientific Review Board, to discuss a proposed HIV prevention protocol, May 15, 2003


Grant Support

Awarded, as Principal Investigator (with Clare Gnecco, Ph.D., Co-Investigator), FDA Center for Drug Evaluation and Research (CDER) Grant #RSR-96-004A ($10,500) to study the analysis of objective tumor response data, 1996.

Awarded, as Principal Investigator (with Steven Hirschfeld, M.D., Ph.D., Co-Investigator), Food and Drug Administration Office of Women's Health funding ($145,000) to improve the biostatistical analysis of ovarian cancer data, 4/99.


Bibliography

Publications in Peer-Reviewed Journals:

Hrabinski D, Hertz JL, Tantillo C, Berger V, Sherman AR. Iron repletion attenuates the protective effects of iron deficiency in DMBA-induced mammary tumors in rats. Nutrition and Cancer 1995;24(2):133-142.

Berger V, Sackrowitz H. Improving tests for superior treatment in contingency tables. Journal of the American Statistical Association 1997;92(438):700-705.

Berger V. Admissibility of exact conditional tests of stochastic order. Journal of Statistical Planning and Inference 1998;66(1):39-50.

Berger VW, Permutt T, Ivanova A. The convex hull test for ordered categorical data. Biometrics 1998;54(4):1541-1550.

Berger V, Song J. Randomized clinical trials for the study of traditional Chinese medicine. Bio/Pharma Quarterly of the Society of Chinese Bioscientists in America 1998;4(2):2-4.

Berger VW, Exner DV. Detecting selection bias in randomized clinical trials. Controlled Clinical Trials 1999;20:319-327.

Weerahandi S, Berger VW. Exact inference for growth curves with intraclass correlation structure. Biometrics 1999;55(3):921-924.

Berger VW. Critique of Ludbrook and Dudley. American Statistician 2000;54(1):85-86.

Berger VW. Pros and cons of permutation tests in clinical trials. Statistics in Medicine 2000;19:1319-1328.

Permutt T, Berger VW. Rank tests in ordered 2xk contingency tables. Communications in Statistics, Theory and Methods 2000;29(5):989-1003.

Berger VW. The p-value interval as an inferential tool. Journal of the Royal Statistical Society D 2001;50(1):79-85.

Honkanen VEA, Siegel AF, Szalai JP, Berger VW, Feldman BM, Siegel JN. A three-phase clinical trial design for rare disorders. Statistics in Medicine 2001;20:3009-3021.

Ivanova A, Berger VW. Drawbacks to integer scores for ordered categorical data. Biometrics 2001;57:567-570.

Berger VW, Ivanova A. Adaptive tests for ordinal data. Journal of Modern Applied Statistical Methods 2002;1(2):269-280.

Berger VW, Ivanova A. Bias of linear rank tests for stochastic order in ordered categorical data. Journal of Statistical Planning and Inference 2002;107(1):237-247.

Berger VW, Lunneborg C, Ernst MD, Levine JG. Parametric analyses in randomized clinical trials. Journal of Modern Applied Statistical Methods 2002;1(1):74-82.

Berger VW. Improving the information content of categorical clinical trial endpoints. Controlled Clinical Trials 2002;23(5):502-514.

Berger VW. Additional reflections on significance testing. Journal of Modern Applied Statistical Methods 2003;2(2):514-515.

Berger VW, Bears J. When can a clinical trial be called 'randomized'? Vaccine 2003;21:468-472.

Berger VW, Christophi CA. Randomization technique, allocation concealment, masking, and susceptibility of trials to selection bias. Journal of Modern Applied Statistical Methods 2003;2(1):80-86.

Berger VW, Ivanova A, Knoll MD. Minimizing predictability while retaining balance through the use of less restrictive randomization procedures. Statistics in Medicine 2003;22(19):3017-3028.

Berger VW, Rezvani A, Makarewicz V. Direct effect on validity of response run-in selection in clinical trials. Controlled Clinical Trials 2003;24(2):156-166.

Berger VW. Further analysis of the Gallin Itraconazole Trial. New England Journal of Medicine 2003;349(12):1190.

Berger VW. Does the Prentice criteria validate surrogate endpoints? Statistics in Medicine 2004;23(10):1571-1578.

Berger VW. On the generation and ownership of alpha in medical studies. Controlled Clinical Trials 2004;25(6):613-619.

Berger VW. Selection bias and baseline imbalances in randomized trials. Drug Information Journal 2004;38:1-2.

Berger VW. Valid adjustment for binary covariates of randomized binary comparisons. Biometrical Journal 2004;46(5):589-594.

Berger VW, Ioannidis JPA. The Decameron of poor research. British Medical Journal 2004;329:1436-1440.

Berger VW, Weinstein S. Ensuring the comparability of comparison groups: is randomization enough? Controlled Clinical Trials 2004;25(5):515-524.

Berger VW, Zhou YY, Ivanova A, Tremmel L. Adjusting for ordinal covariates by inducing a partial ordering. Biometrical Journal 2004;46(1):48-55.

Fuchs C, Berger VW. Quantifying the proportion of cases attributable to an exposure. Journal of Modern Applied Statistical Methods 2004;3(1):54-64.

Berger VW. A novel criterion for selecting covariates. Drug Information Journal 2005;39:233-241.

Berger VW. A review of methods for ensuring the comparability of comparison groups in randomized trials. Reviews on Recent Clinical Trials 2005;1(1):81-86.

Berger VW. Allocation concealment and blinding: when ignorance is bliss. Medical Journal of Australia 2005;183(3):165.

Berger VW. Nonparametric adjustment techniques for binary covariates. Biometrical Journal 2005;47(2):199-205.

Berger VW. Quantifying the magnitude of baseline covariate imbalances resulting from selection bias in randomized clinical trials (with discussion). Biometrical Journal 2005;47(2):119-127.

Berger VW. The reverse propensity score to manage baseline imbalances in randomized trials. Statistics in Medicine 2005;24:2777-2787.

Berger VW. Training statisticians to be alert to the dangers of mis-applying statistical methods when they do not apply. Journal of Modern Applied Statistical Methods 2005;4(2):587-590.

Berger VW. What is to be done about selection bias. Biometrical Journal 2005;47(2):136-139.

Berger VW, Durkalski VL. Analysis of trichotomous pharmaceutical endpoints. Journal of Biopharmaceutical Statistics 2005;15:739-745.

Berger VW, Hsieh G. Rethinking statistics: basing efficacy alpha levels on safety data in randomized trials. Israeli Journal of Emergency Medicine 2005;5(3):55-60.

Berger VW, Matthews JR. Conducting today's trials by tomorrow's standards. Pharmaceutical Statistics 2005;4:155-159.

Berger VW, Semanick L. Refining the assessment of the sensitivity and specificity of diagnostic tests, with applications to prostate cancer screening and non-small cell lung cancer staging. Pan American Journal of Public Health 2005;18(1):64-70.

Ivanova A, Barrier R, Berger VW. Adjusting for selection bias in randomized trials. Statistics in Medicine 2005;24:1537-1546.

Berger VW. Can pain be quantified numerically? J Rheumatol 2006;33(11):2364.

Berger VW. Is the Jadad score the proper evaluation of trials? J Rheumatol 2006;33(8):1710-1712.

Berger VW. Misguided precedent is not a reason to use permuted blocks. Headache 2006;46(7):1210-1212.

Berger VW. Response to Klassen et al: Missing data should be more heartily penalized. Journal of Clinical Epidemiology 2006;59(7):759-760.

Berger VW, Stefanescu C. The analysis of stratified 2x2 contingency tables. Biometrical Journal 2006;48(6):992-1007.

Dores GM, Chang S, Berger VW, Perkins SN, Hursting SD, Weed DL. Evaluating research training outcomes: experience from the Cancer Prevention Fellowship Program at the National Cancer Institute. Academic Medicine 2006;81(6):535-541.


Books, Book Chapters, and Book Reviews:

Berger VW. (ed). Testing for Stochastic Order in Contingency Tables. Ann Arbor, MI:UMI Dissertation Services, 1995.

Berger VW. (ed). The Probability Problem Solver. Piscataway, NJ:Research and Education Association, 1996.

Berger VW, Ivanova A. Permutation tests for phase III clinical trials. In Millard SP, Krause A (eds). Applied Statistics in the Pharmaceutical Industry with Case Studies Using S-PLUS. New York:Springer-Verlag, 2001;349-374.

Berger VW, (ed). Selection Bias and Covariate Imbalances in Randomized Clinical Trials. Chichester:John Wiley & Sons, 2005.

Berger VW. Permutation, parametric, and bootstrap tests of hypotheses. In Good P (ed). Statistics in Medicine, Third Edition 2006; in press.

Berger VW. Dicing with death: chance, risk, and health. In Senn S (ed). Computational Statistics 2006; in press.


In Press Publications in Peer-Reviewed Journals:

Berger VW. A singularly uninformative study. Journal of Clinical Epidemiology 2006; in press.


Work Submitted to Peer-Reviewed Journals and in Progress:

Berger VW, Durkalski V, Wunderink K. Re-formulating equivalence trials as superiority trials. Submitted 2/06.

Berger VW, Matthews JR, Grosch EN. On improving research methodology in medical studies. Submitted 2/06.


Entries in Everitt B, Howell D (eds). The Encyclopedia of Behavioral Statistics. Chichester: John Wiley & Sons, 2005:

Berger VW. A Priori v Post Hoc Testing. 2005;1:1-5.

Berger VW. Analysis of Covariance: Nonparametric. 2005;1:50-52.

Berger VW. Block Random Assignment. 2005;1:165-167.

Berger VW. Censored Observations. 2005;1:243-244.

Berger VW. Generalizability. 2005;2:702-704.

Berger VW. Mid-P Values. 2005;3:1221-1223.

Berger VW, Durkalski V. Paradoxes. 2005;3:1511-1517.

Berger VW, Li Z. Paired Observations, Distribution-Free Methods. 2005;3:1505-1509.

Liu L, Berger VW. Two-by-Two Contingency Tables. 2005;4:2076-2081.

Berger VW, Liu L, Hershberger S. Trend Tests. 2005;4:2063-2066.

Berger VW, Liu L, Thach C. Matching. 2005;3:1154-1158.

Berger VW, Shafran R. Historical Controls. 2005;2:821-823.

Berger VW, Zhang JL. Case-Control Studies. 2005;1:206-207.

Berger VW, Zhang JL. Marginal Independence. 2005;3:1126-1128.

Berger VW, Zhang JL. Models for Matched Pairs. 2005;3:1253-1256.

Berger VW, Zhang JL. Simple Random Sampling. 2005;4:1840-1841.

Berger VW, Zhang JL. Stratification. 2005;4:1902-1905.

Berger VW, Zhang JL. Structural Zeros. 2005;4:1958-1959.

Berger VW, Zhou YY. Adaptive Random Assignment. 2005;1:10-13.

Berger VW, Zhou YY. Binomial Distribution: Estimating and Testing Parameters. 2005;1:155-157.

Berger VW, Zhou YY. Kolmogorov-Smirnov Tests. 2005;2:1023-1026.

DePuy V, Berger VW. Counterbalancing. 2005;1:418-420.

DePuy V, Berger VW, Zhou YY. Wilcoxon-Mann-Whitney Test. 2005;4:2118-2121.

Durkalski V, Berger VW. Categorizing Data. 2005;1:239-242.

Goldsmith LJ, Berger VW. Intrinsic Linearity. 2005;2:954-955.

Goldsmith LJ, Berger VW. Nonlinear Models. 2005;3:1416-1419.

Goldsmith LJ, Berger VW. Polynomial Model. 2005;3:1555-1557.

Li Z, Berger VW. Adaptive Sampling. 2005;1:13-16.

Liu L, Berger VW. Randomized Block Design: Nonparametric Analysis. 2005;4:1681-1686.

Stefanescu C, Berger VW, Hershberger S. Probits. 2005;4:1608-1610.

Stefanescu C, Berger VW, Hershberger S. Yates' Correction. 2005;4:2127-2129.

Thach C, Berger VW. Simple Random Assignment. 2005;4:1838-1840.

Wang M, Berger VW. The Breslow-Day Test. 2005;1:184-186.


Entries in D'Agostino R, Sullivan L, Massaro J (eds). The Encyclopedia of Clinical Trials. Chichester:John Wiley & Sons, 2006:

Berger VW, Grant WC, Dupin-Spriet T, Chappel R. Randomization.

Berger VW, Durkalski V. Run-ins.


Abstracts, Dissertations, Proceedings, Technical Reports, and Electronic Letters:

Berger VW. Ordered alternatives for multinomial distributions. The IMS Bulletin 1994;23(3):357, 94t-10.

Berger VW. Testing for Stochastic Order in Contingency Tables. Ann Arbor:UMI Dissertation Services, 1995.

Berger VW, Gnecco C, Chi G, Hirschfeld S. Perspectives on tumor response and other ordinal data. Proceedings of the Biopharmaceutical Section of the American Statistical Association 1996; 112-117.

Berger VW. Attributing different treatment outcomes to differences in treatments. Proceedings of the Biopharmaceutical Section of the American Statistical Association 1997; 144-149.

Berger VW. Publishing raw categorical data in full. Electronic Response to Infopoints: Publishing Raw Data and Real Time Statistical Analysis on E-Journals. BMJ.COM 2001;322.

Berger VW. Not all clinical trials are created equal. Electronic Response to Infopoints: Any Casualties in the Clash of Randomised and Observational Evidence? BMJ.COM 2001;322.

Berger VW, Yi E, Edukat L. Stratifying for variables influenced by treatment. Electronic Response to Infopoints: Assessing of Grouping Variable Should Have Been Blind in Trial of Dementia. BMJ.COM 2001; 322.

Berger VW. In defense of hypothesis testing. Electronic Response to Infopoints: Likelihood Ratios Are Alternatives to P-Values. BMJ.COM 2001; 322.


Round-table Luncheon Discussions and Invited Tutorials and Workshops:

12/9/98 --

Clinical Trial Design and Permutation Tests;
Deming Conference on Applied Statistics; Atlantic City, NJ.

8/9/99 --Discussion of Permutation Tests;
Lunch #M10, Joint Statistical Meetings; Baltimore, MD.
5/12/02 --Selection Bias in Randomized Clinical Trials,
Society of Clinical Trials Meeting, Arlington, VA.
12/02/02 --Managing Baseline Imbalances in Clinical Trials;
Deming Conference on Applied Statistics; Atlantic City, NJ.
7/09/04 --Overview of Biostatistical Principles;
Fogarty Center Lecture; Bethesda, MD.
7/20/05 --Overview of Biostatistical Principles;
Fogarty Center Lecture; Bethesda, MD.
9/15/05 --Managing Baseline Imbalances in Randomized Trials;
FDA/Industry Statistics Workshop; Washington, DC.
11/18/05 --Managing Baseline Imbalances in Randomized Trials;
CBER (FDA) Clinical Trials Training; Rockville, MD.



Selected Invited Seminars Since 1996:

The Convex Hull Test for Ordered Trichotomous Data

8/06/96 --Special Contributed Session #164 (FDA Special Session),
Joint Statistical Meetings (JSM); Chicago, IL
10/11/96 --Division of Statistics, Department of Mathematics,
University of Virginia; Charlottesville, VA
11/24/97 --Biostatistics Seminar, Division of Biostatistics,
Yale Medical School; New Haven, CT

Pros and Cons of Permutation Tests in Clinical Trials

8/12/97 --Special Contributed Session #133 (FDA Special Session),
Joint Statistical Meetings; Anaheim, CA
3/31/99 --Invited Session #62,
Permutation Tests and Clinical Trials (Dan Zelterman),
ENAR; Atlanta, GA
10/27/04 --Statistical Techniques in Clinical Trials,
Henry Stewart, Gaithersburg, MD.

Detecting, Preventing, & Correcting for Selection Bias in Randomized Clinical Trials

6/19/99 --

Invited Session #I6 (Mei Ling Lee),
ICSA Applied Statistics Symposium; Washington, DC

8/20/99 --

National Institute of Occupational Safety and Health (NIOSH);
Morgantown, WV.

12/1/00 --Statistics Seminar, Department of Mathematics,
University of Maryland Baltimore County; Baltimore County, MD
6/02/00 --Invited Session #7 (Greg Campbell),
ICSA Applied Statistics Symposium; Piscataway, NJ.
3/29/01 --National Institute of Environmental Health Sciences;
Research Triangle Park, NC.
8/09/01 --Invited Session #200207, Selection Bias in RCTs,
Joint Statistical Meetings (JSM); Atlanta, GA.
4/03/02 --Special Contributed Session #195,
Joint Statistical Meetings (JSM); New York, NY.
8/13/02 --Emmes Corporation Seminar Series; Rockville, MD.
6/13/05 --Invited Session #3C, Clinical Trial Design,
ICSA Applied Statistics Symposium; Bethesda, MD.
9/16/05 --FDA/Industry Statistics Workshop; Washington, DC.

Smaller Studies from Combining Endpoints

6/29/99 --Smaller Studies (Peter A. Lachenbruch),
DIA 35th Annual Meeting; Baltimore, MD.

A New Look at Equivalence Studies

12/08/03 --Solving the Regulatory and Statistical Issues in Control Group Trials,
Henry Stewart, Washington, DC.

Direct Effect of Response Run-in Selection on the Validity of Randomized Trials

6/15/04 --

Randomization Issues in Clinical Trials (Suli Verjee),
DIA 40th Annual Meeting; Washington, DC.

Nonparametric Covariate Adjustment

10/28/04 --Statistical Techniques in Clinical Trials,
Henry Stewart, Gaithersburg, MD.
9/16/05 --FDA/Industry Statistics Workshop; Washington, DC.


Clinical Trial Training Internships

The good news is that student internships are available. The bad news is that there are no funds available for stipends, payments, or reimbursements for any travel costs. Because of the lack of funds, every effort has been made to make this internship attractive to at least some applicants. So the interns are treated as scholars and researchers, and are not given any clerical work to do (as in, they may enter their own data or make their own photocopies, but they would not be asked to do these activities for others). Interns are free to attend any of the lectures in or around the NIH and to discuss collaboration with the lecturers (or with other NIH researchers). Library services are available to interns, and there is a free gym as well. The hours are flexible (to allow interns to pursue other paying positions), as is the nature of the work itself. That is, interns can, to some extent, define their own research projects, especially to make them more compatible with a doctoral dissertation. The interns join me in conducting research, and preparing articles for publication. As such, the interns can expect to be named as co-authors on any article to which they make a contribution. Specific activities to support research projects would involve some or all of the following:

  • library searches
  • on-line searches
  • retrieving, reading, and summarizing published articles
  • synthesizing material into an article we prepare for publication
  • data analysis
  • computer simulations
  • preparation of tables and figures
  • delivery of lectures

Interns may get involved in the review of grant applications and/or other non-methodological activities, such as training medical fellows. However, this is a clinical trials training internship, and so methodological research is the primary focus. It is best to discuss potential research projects during the application process itself, because the proposed project will be considered in the selection process. It would be helpful to look through some of my papers, either to find areas I am working in (if an interest is expressed in one of these areas, then I may be able to suggest a research project) or to begin to try to connect your own research to mine. So if nothing else, then it would be helpful to at least specify which of my papers interest you (or, if you need copies of any papers, then let me know that too). Also, while a phone interview will suffice, if you are local, or just happen to be visiting the DC Area, then it is best if you can interview in person.

Because the internship is unpaid, it certainly is not for everyone. But the internship has been quite productive for many interns in the past (many of whom are willing to talk to potential interns about their experiences), and should continue to be mutually beneficial to others in the future. For those who cannot afford to live in this area without payment, please bear in mind that it is possible to collaborate from a distance, using e-mail, faxes, and phone conversations. Therefore, I would welcome inquiries even from those who cannot spend a few months in the DC Area but might still like to collaborate on research projects of mutual interest.

Even though the internship is unpaid, it is still competitive, and we generally receive applications from many more qualified applicants than we can possibly hire. So here are a few tips for ensuring that your application receives the most favorable consideration. This is a research position. Telling me how much you want to work with me is disingenuous if you know nothing about my research (and, by extension, know nothing about the research work you would be doing). Check my CV for a complete list of my publications. This is not to say that I expect you to read all of my papers, or even most of them, but there has to be a reason that you want to work with me, so you should have read at least one of them. Please specify which articles you have read and why you would like to conduct similar research, or research that builds on this earlier work.

There are other excellent papers and books (not mine) that are also worth reading. These include the following:

Altman DG. The scandal of poor medical research. British Medical Journal 1994;308(6924):283-284.

Bross ID. How to eradicate fraudulent statistical methods: statisticians must do science. Biometrics 1990;46(4):1213-1225.

Edgington ES (ed). Randomization Tests. New York:Marcel Dekker, 1995.

Feigenbaum S, Levy DM. The technological obsolescence of scientific fraud. Rationality and Society 1996;8(3):261-276.

Feinstein AR. Permutation tests and statistical significance. MD Computing 1993;10(1):28-41.

Forsman B. An ethical analysis of the phenomenon of misconduct in research. Acta Oncologica 1999;38(1):107-110.

Greenhouse JB, Stangl D, Kupfer DJ, Prien RF. Methodologic issues in maintenance therapy clinical trials. Archives of General Psychiatry 1991;48(4):313-318.

Leber PD, Davis CS. Threats to the validity of clinical trials employing enrichment strategies for sample selection. Controlled Clinical Trials 1998;19(2), 178-187.

Lindley DV. The role of exchangeability in inference. Annals of Statistics 1981;9(1):45-58.

Little RJA. Testing the equality of two independent binomial proportions. The American Statistician 1989;43(4):283-288.

Moore, T (ed). Deadly Medicine. New York:Harper Collins, 1995.

Moses LE. Measuring effects without randomized trials: options, problems, challenges. Medical Care 1995;33(4):AS8-AS14.

Moye LA. End-point interpretation in clinical trials: the case for discipline. Controlled Clinical Trials 1999;20(1):40-49.

Oxtoby A, Jones A, Robinson M. Is your double-blind design truly double-blind? British Journal of Psychiatry 1989;155:700-701.

Senn S. Fisher's game with the devil. Statistics in Medicine 1994;13(3):217-230.

Senn SJ. Falsificationism and clinical trials. Statistics in Medicine 1991;10(11):1679-1692.

Tukey JW. Tightening the clinical trial. Controlled Clinical Trials 1993;14(4):266-285.

Wieand S, Murphy K. A commentary on treatment at random: the ultimate science or the betrayal of Hippocrates? Journal of Clinical Oncology 2004;22(24):5009-5011.

Directions to 6130 Executive Boulevard

Metro
Take the metro red line to either Twinbrook (and walk to Executive Plaza) or White Flint (and catch the shuttle to Executive Plaza).

Driving
Take I-95 to I-495 West, then I-270 North to Exit 4 (Montrose Road East); turn right at the third light onto Executive Boulevard, then right at the first light into Executive Plaza.



This page was last updated July 31, 2007