2007 Progress Report: Carolina Environmental Bioinformatics Center
EPA Grant Number: R832720
Center: The Carolina Environmental Bioinformatics Research Center
Center Director: Wright, Fred A.
Title: Carolina Environmental Bioinformatics Center
Investigators: Wright, Fred A.; Farber, Rosann; Hemminger, Brad; Rusyn, Ivan; Stotts, David; Tropsha, Alex
Institution: University of North Carolina at Chapel Hill
EPA Project Officer: Mustra, David
Project Period: October 1, 2005 through September 30, 2010
Project Period Covered by this Report: October 1, 2006 through September 30, 2007
Project Amount: $4,494,117
RFA: Computational Toxicology: Environmental Bioinformatics Research Center (2004)
Research Category: Computational Toxicology, Health Effects
Description:
Objective: The objective of the Carolina Environmental Bioinformatics Center (CEBC) is to advance the field of computational toxicology by developing novel analytical and computational methods for interpreting toxicological data. The CEBC also encourages the use of these methods within the broader research community by creating publicly available, user-friendly tools and interfaces.
Progress Summary: The Carolina Environmental Bioinformatics Center is organized around three main research projects.
Project 1 (Biostatistics) is developing new methods to analyze computational toxicology data and applying them to relevant environmental datasets. Its work has focused on efficiently identifying differentially expressed genes, performing microarray quality control, improving prediction procedures, and performing expression quantitative trait locus (eQTL) analysis, in which genotype is related to expression and toxicity response. These approaches have been applied to EPA-generated datasets and implemented as publicly available code in R. Further code development in C and Java is proceeding rapidly.
Project 2 (Chem-Informatics) is applying Quantitative Structure-Toxicity Relationship (QSTR) modeling to predictive computational toxicology. Formalized toxicity data models and public toxicity data schemas allow for flexible data mining and relational data searching across layers of chemical and biological information. We have analyzed data in the Carcinogenic Potency Database (CPDB) and in the National Toxicology Program (NTP) sponsored High-Throughput Screening (HTS) campaign, as well as the EPA ToxCast and DSSTox datasets. We have developed rigorously validated, predictive QSTR models that demonstrate the utility of toxico-cheminformatics for predictive toxicology, and we are developing standard workflows for this activity.
Project 3 (Computational Infrastructure for Systems Toxicology) is creating new computing code and algorithms to support the other research projects. In addition, we are developing methods and software to (i) dissect genetic networks that control liver gene expression; (ii) perform time-course and dose-dependent analysis of gene expression data; and (iii) perform fast analysis of gene expression and genotype data applicable to toxicity datasets.
Future Activities: For Project 1, the shift in priorities toward practical coding of user-friendly software is continuing in Year 3. The eQTL analysis software development will constitute a major part of the effort, in which Project 1 investigators advise programmers in Project 3. Bootstrap-SAFE and transcription factor SAFE are also high priorities for continuing code development. Some of the eQTL analysis effort in Year 3 will be directed to analysis of HapMap CEU cell lines for association with a toxicity assay panel and expression arrays, in collaboration with the Project 3 group. HAP-SAMPLE and related methods will be explored for utility in assessing eQTL analysis methods. Direct collaborations with EPA NCCT personnel are expected to be maintained at the current level or to increase. The expansion of ToxCast has provided considerable illustrative data for testing data-mining approaches in the context of computational toxicology, and these efforts will continue.
For Project 2, the Year 3 efforts continue in four areas: (i) QSAR modeling of multiple animal toxicity endpoints; (ii) development of novel QSAR methodology for toxicity studies; (iii) follow-up studies of the NTP-HTS project; and (iv) pre-clustering of compounds in toxicity modeling. For all of these activities we rely on data collected under the ToxCast, DSSTox, and ACToR projects.
For Project 3, activities for next year include: (i) integration and support of tools from other CEBC projects; (ii) continued assistance with programming improvements to algorithms in individual tools and applications as needed; (iii) development of specific data-mining algorithms for genomic databases; (iv) extension of our computational work on fast approaches for genome-wide expression QTL analysis to human haplotypes and much larger datasets; (v) development of the continuum of canonical correlation tools for multi-dimensional (e.g., gene expression and metabolomic) datasets; and (vi) continued biology-driven research that generates appropriate datasets for testing and implementing novel computational and biostatistical approaches.
Journal Articles:
Citation | Grant (Report Year) |
---|---|
Gatti D, Maki A, Chesler EJ, Kirova R, Kosyk O, Lu L, Manly KF, Williams RW, Perkins A, Langston MA, Threadgill DW, Rusyn I. Genome-level analysis of genetic regulation of liver gene expression networks. Hepatology 2007;46(2):548-557. | R832720 (2007) |
Graham MR, Virtaneva K, Porcella SF, Gardner DJ, Long RD, Welty DM, Barry WT, Johnson CA, Parkins LD, Wright FA, Musser JM. Analysis of the transcriptome of Group A Streptococcus in mouse soft tissue infection. American Journal of Pathology 2006;169(3):927-942. | R832720 (2007) |
Hu J, Wright FA, Zou F. Estimation of expression indexes for oligonucleotide arrays using the singular value decomposition. Journal of the American Statistical Association 2006;101(473):41-50. | R832720 (2006) |
Nadler JJ, Zou F, Huang H, Moy SS, Lauder J, Crawley JN, Threadgill DW, Wright FA, Magnuson TR. Large-scale gene expression differences across brain regions and inbred strains correlate with a behavioral phenotype. Genetics 2006;174(3):1229-1236. | R832720 (2006) |
Roberts A, Pardo-Manuel de Villena F, Wang W, McMillan L, Threadgill DW. The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics. Mammalian Genome 2007;18(6-7):473-481. | R832720 (2007) |
Tropsha A, Golbraikh A. Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Current Pharmaceutical Design 2007;13(34):3494-3504. | R832720 (2007) |
Woods CG, Vanden Heuvel JP, Rusyn I. Genomic profiling in nuclear receptor-mediated toxicity. Toxicologic Pathology 2007;35(4):474-494. | R832720 (2007) |
Woods CG, Kosyk O, Bradford BU, Ross PK, Burns AM, Cunningham ML, Qu P, Ibrahim JG, Rusyn I. Time course investigation of PPARα- and Kupffer cell-dependent effects of WY-14,643 in mouse liver using microarray gene expression. Toxicology and Applied Pharmacology 2007;225(3):267-277. | R832720 (2007) |
Wright FA, Huang H, Guan X, Gamiel K, Jeffries C, Barry WT, Pardo-Manuel de Villena F, Sullivan PF, Wilhelmsen KC, Zou F. Simulating association studies: a data-based resampling method for candidate regions or whole genome scans. Bioinformatics 2007;23(19):2581-2588. | R832720 (2007) |
Zhu H, Rusyn I, Richard A, Tropsha A. The use of cell viability assay data improves the prediction accuracy of conventional quantitative structure activity relationship models of animal carcinogenicity. Environmental Health Perspectives 2008 Jan 4 [Epub ahead of print] doi:10.1289/ehp.10573. | R832720 (2007) |
Keywords: bioinformatics, biostatistics, computational toxicology, toxicogenomics, QSAR
ENVIRONMENTAL MANAGEMENT, Scientific Discipline, Health, Risk Assessment, Biology, Risk Assessments, Biochemistry, Environmental Monitoring, exposure assessment, biochemical research, chemical composition, ecological risk assessment, toxicologic assessment, bioinformatics, human health risk, biopollution, biostatistics, dose-response, toxicology, environmental risks, risk, outreach and training, computational toxicology