National Cancer Institute

Cancer Control and Population Sciences - NCI's bridge to public health research, practice and policy

Cancer Control Research

5R01CA106355-03
Chen, Hua Yun
NOVEL STATISTICAL METHODS FOR DATA WITH MISSING VALUES

Abstract

DESCRIPTION (provided by applicant): Missing covariate values are common in studies of risk factors for disease and in many other biomedical studies. Simple complete-case analysis, which is routinely used, suffers from bias in addition to loss of efficiency. Current advanced statistical methods for analyzing such data see limited use in practice because of concerns about robustness, difficulty of implementation, or both. This project aims to develop new statistical methods for modeling missing covariates in regression models so that inferences on regression parameters are robust, efficient, and easy to implement. The objective is to be reached through four steps: (1) A general semiparametric odds ratio model is proposed for complex missing data problems. The proposed model makes the likelihood approach commonly used in practice more robust, more flexible, and easy to apply. (2) The likelihood method for regression with missing data is further robustified in three ways: when missing patterns are relatively simple, smoothing spline models for the odds ratio function are proposed; when missing patterns are complex, the likelihood estimator is modified to be doubly robust and locally efficient; and a framework is proposed for sensitivity analysis with general missing data mechanisms. (3) For problems with a large number of covariates subject to missing values, model selection procedures are studied based on imputed complete data under the semiparametric covariate model. Such procedures can be very helpful in studying risk factors for health events, for example in identifying risk factors for bone fracture from a set of potential risk factors subject to missing values. (4) For all the missing data problems under consideration, software implementing the resulting methods will be developed and disseminated.
The proposed research, when completed, will make analyses of biomedical data with missing covariate values more accessible to researchers in many applied fields and thus promote efficient use of valuable data, such as those from HIV and cancer studies.
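
The abstract's claim that complete-case analysis can be biased can be illustrated with a small simulation. The sketch below is not the project's proposed method: the linear model, the sample size, the observation probabilities (0.9 and 0.1), and the function names `fit_slope` and `simulate` are all assumptions made for illustration. The covariate is observed more often when the outcome is low, so dropping incomplete rows distorts the fitted slope. Inverse probability weighting, a standard alternative that here assumes the observation probabilities are known, is shown for comparison.

```python
import numpy as np


def fit_slope(x, y, weights=None):
    """Slope from (optionally weighted) least squares of y on x with an intercept."""
    if weights is None:
        weights = np.ones_like(x)
    sw = np.sqrt(weights)  # scaling rows by sqrt(w) gives weighted least squares
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[1]


def simulate(n=20000, seed=0):
    """Compare full-data, complete-case, and IPW slope estimates (true slope is 1)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 1.0 * x + rng.normal(size=n)

    # Assumed missingness mechanism: x is observed with probability 0.9 when
    # y < 1 and 0.1 otherwise. It depends only on the always-observed outcome.
    p_obs = np.where(y < 1.0, 0.9, 0.1)
    observed = rng.random(n) < p_obs

    return {
        "full": fit_slope(x, y),
        "complete_case": fit_slope(x[observed], y[observed]),
        # Weight each complete case by 1 / P(observed); assumes p_obs is known.
        "ipw": fit_slope(x[observed], y[observed], weights=1.0 / p_obs[observed]),
    }


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name:>14}: slope = {slope:.3f}")
```

Under these assumptions the complete-case slope falls well below the true value of 1, while the weighted fit recovers it at the cost of extra variance from the large weights. That trade-off is one reason for seeking estimators that are both robust and efficient.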
