Nonparametric, Hypothesis-based Analysis of High-Dimensional Data

Statistical Engineering Division Seminar

Dr. Jeanne Kowalski
Assistant Professor of Oncology and Biostatistics
Johns Hopkins University
Thursday, November 15, 2007, 3:00-4:00 PM
Building 222, Room A264

Abstract

In this talk, I describe two novel nonparametric, inference-based approaches for analyzing genetic and genomic heterogeneity associated with groups of similar phenotype. A common theme is the construction of testable hypotheses within the confines of very few samples from data of very high dimension. For the case of a modest sample size, I discuss a distance-based approach to the analysis of genetic heterogeneity based on population sequence data. In the more extreme case of microarray data on several single-sample groups to be compared, I introduce the concept of stochastic linear hypotheses, which generalizes the Mann-Whitney-Wilcoxon rank sum test to accommodate comparisons of more than two groups in a multivariate setting. In each case, I also discuss bioinformatics approaches to characterize observed heterogeneity differences among groups of similar phenotype, in terms of either locations within a sequence or genes within a genome. As motivation for the methods, I examine two separate problems: one relating sequence differences in a region of the HIV genome to drug resistance, and a second relating gene expression differences to hypothesized pathways for immunogenetic analysis of T cells.

NIST Contact: John Lu, (301) 975-3208.

Date created: 11/15/2007
Last updated: 11/15/2007