Statistical Engineering Division Seminar
Nonparametric, Hypothesis-based Analysis of High-Dimensional Data
Dr. Jeanne Kowalski

Abstract

In this talk, I describe two novel nonparametric, inference-based approaches for analyzing genetic and genomic heterogeneity associated with groups of similar phenotype. A common theme is the construction of testable hypotheses within the confines of very few samples from data of very high dimension. For the case of a modest sample size, I discuss a distance-based approach to analysis of genetic heterogeneity based on population sequence data. In the more extreme case of a microarray experiment in which several single-sample groups are to be compared, I introduce the concept of stochastic linear hypotheses, which generalizes the Mann-Whitney-Wilcoxon rank sum test to accommodate comparisons among more than two groups in a multivariate setting. In each case, I also discuss bioinformatics approaches to characterize observed heterogeneity differences among groups of similar phenotype, in terms of either locations within a sequence or genes within a genome. As motivation for the methods, I examine two separate problems: one relating sequence differences in a region of the HIV genome to drug resistance, and a second relating gene expression differences to hypothesized pathways for immunogenetic analysis of T cells.

NIST Contact: John Lu, (301) 975-3208.
Date created: 11/15/2007