Statistical Standards Program
ANALYSIS OF DATA / PRODUCTION OF ESTIMATES OR PROJECTIONS
SUBJECT: VARIANCE ESTIMATION

NCES STANDARD: 5-2

PURPOSE: Most NCES sample designs have one or more of the following three characteristics: unequal probabilities of selection, stratification, and clustering. It is therefore important to ensure that appropriate techniques for estimating variance in sample surveys are identified, implemented, and documented.

KEY TERMS: clustered samples, confidentiality, Data Analysis System (DAS), DEFT, design effect (DEFF), estimation, imputation, raking, point estimate, replication methods, Simple Random Sampling (SRS), strata, Taylor-series linearization, and variance.
Approximate variance estimation methods that adjust for most of the impact of clustering and stratification include bootstrap, jackknife, Balanced-Repeated Replication (BRR), and Taylor-series linearization. Replication methods (bootstrap, jackknife, and BRR) can also adjust for the impact of nonresponse, post-stratification, and raking. When replication methods are used, the number of replicates should be large enough to enable stable variance estimation (e.g., ≥ 30) and small enough (e.g., ≤ 100) for efficient calculation.

GUIDELINE 5-2-2A: The preferred way to derive appropriate variance estimates for totals, means, proportions, and regression coefficients is to use a statistical package that does not assume simple random sampling (SRS). Such packages, including SUDAAN, WesVar, DAS, and Stata, use techniques such as Taylor-series linearization or one of the replication methods mentioned above.

GUIDELINE 5-2-2B: Consideration should be given to incorporating an adjustment for imputations in variance estimation procedures.

GUIDELINE 5-2-2C: In some cases, alternative approximation strategies can be used to produce variance estimates. For example, software for multilevel models can be used to produce estimates that take into account some aspects of complex survey design. Care must be taken to include any clustering of the sample as a level in the model(s). In addition, any design variables and weights, such as those associated with strata or measures of size, should be taken into account.
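As an illustration of how a replication method works, the following minimal sketch computes a delete-one-cluster jackknife variance for a weighted mean. It is not taken from this standard or from any of the packages named above. The function names (`weighted_mean`, `jk1_replicate_weights`, `jk1_variance`) and the toy data are hypothetical, and the sketch assumes an unstratified design in which each cluster is a primary sampling unit. The toy example has only four replicates, far fewer than the range suggested above, and production estimates would normally come from a package that supports complex designs.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def jk1_replicate_weights(weights, clusters):
    """Build delete-one-cluster jackknife replicate weights.

    Replicate g sets the weights of cluster g to zero and scales the
    remaining weights by G / (G - 1), where G is the number of clusters.
    """
    labels = sorted(set(clusters))
    n_clusters = len(labels)
    if n_clusters < 2:
        raise ValueError("need at least two clusters")
    scale = n_clusters / (n_clusters - 1)
    return [
        [0.0 if c == dropped else w * scale for w, c in zip(weights, clusters)]
        for dropped in labels
    ]


def jk1_variance(values, weights, clusters, estimator=weighted_mean):
    """Return (full-sample estimate, jackknife variance estimate).

    Variance = (G - 1) / G * sum over replicates of
    (replicate estimate - full-sample estimate) ** 2.
    """
    full = estimator(values, weights)
    replicates = jk1_replicate_weights(weights, clusters)
    n_reps = len(replicates)
    rep_estimates = [estimator(values, rw) for rw in replicates]
    variance = (n_reps - 1) / n_reps * sum((r - full) ** 2 for r in rep_estimates)
    return full, variance


def srs_variance_of_mean(values):
    """Naive variance of an unweighted mean that assumes SRS: s^2 / n."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((y - mean) ** 2 for y in values) / (n - 1)
    return s2 / n


# Hypothetical toy data: four clusters of two units each, equal weights.
values = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
weights = [1.0] * 8
clusters = ["a", "a", "b", "b", "c", "c", "d", "d"]

estimate, jk_var = jk1_variance(values, weights, clusters)
srs_var = srs_variance_of_mean(values)
deff = jk_var / srs_var  # ratio of design-based to SRS-based variance

print(f"mean={estimate:.3f} jackknife_var={jk_var:.4f} "
      f"srs_var={srs_var:.4f} deff={deff:.2f}")
```

Because units within each toy cluster are identical, the jackknife variance is larger than the SRS-based variance, which is the kind of understatement that the packages named in Guideline 5-2-2A are meant to avoid.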
REFERENCES:

Kish, L., Frankel, M. R., Verma, V., and Kaciroti, N. (1995). "Design effects for correlated (Pi-Pj)." Survey Methodology, 21, pp. 117-124. (An example of design effects for estimates of differences between proportions.)

Pfeffermann, D. (1996). "The use of sampling weights for survey data analysis." Statistical Methods in Medical Research, 5, pp. 239-261.

Skinner, C. J., Holt, D., and Smith, T. M. F. (Eds.). (1989). Analysis of Complex Surveys. New York: Wiley.

Lehtonen, R. and Pahkinen, E. J. (1995). Practical Methods for Design and Analysis of Complex Surveys. New York: Wiley.

Potthoff, R. F., Woodbury, M. A., and Manton, K. G. (1992). "Equivalent sample size and equivalent degrees of freedom: refinements for inference using survey weights under superpopulation models." Journal of the American Statistical Association, 87, pp. 383-396.

Goldstein, H. and Rasbash, J. (1998). "Weighting for Unequal Selection Probabilities in Multilevel Models." Journal of the Royal Statistical Society, Series B, 60, pp. 23-40.

Jones, K. (1992). "Using Multilevel Models for Survey Analysis." In Westlake, A. (Ed.), Survey and Statistical Computing. New York: North Holland, pp. 231-242.

Goldstein, H. (1991). "Multilevel Modeling of Survey Data." The Statistician, 40, pp. 235-244.