Version 2.5.2.0 CRISP Logo CRISP Homepage Help for CRISP Email Us

Abstract

Grant Number: 1R01LM007218-01A1
Project Title: Intron evolution: automated phylogenetic analysis system
PI Information:
Name: STOLTZFUS, ARLIN B.
Email: stoltzfu@umbi.umd.edu

Abstract: DESCRIPTION (provided by applicant): Comparative sequence analysis plays an increasingly prominent role in genome annotation, drug target discovery, biomolecule engineering, and medical genetics. The gold standard of comparative sequence analysis is phylogenetic character analysis based on a multiple alignment (MSA) of sequence family members, inference of a phylogenetic tree from the MSA, and analysis of reconstructed changes in sequence features of interest (active site residues, splice junctions, and so on). In this project, an automated system to facilitate evolutionary analysis will be developed, tested, and applied to unresolved issues in the evolution of spliceosomal introns. A software pipeline to assemble sequence family data sets (sequences, MSAs, intron sites, trees) for eukaryotic nuclear protein-coding genes will be developed and tested. Data sets produced in a NEXUS-based standard exchange format will be loaded into SPAN, a database/analysis system that will provide i) a relational schema suitable for storing sequence family data sets; ii) taxonomic query functions based on a comprehensive taxonomic hierarchy; iii) reconstruction of evolutionary changes; iv) query and retrieval based on tree topology, branch lengths, and reconstructions; v) explicit treatment of quality or uncertainty in sequence annotations and evolutionary inferences. Using the software pipeline and SPAN, a series of databases with 20-200 sequence families will be used to evaluate the role of targeted intron gain, and more generally the role of recent events of intron gain and loss, in accounting for non-randomness in the distribution of introns in genes and genomes. After obtaining a refined estimate of the nucleotide preferences of intron gain using evolutionary reconstructions, the implications of targeted gain will be evaluated with respect to biases in intron phase frequencies, amino acid composition near intron sites, and protein structure biases near intron sites. 
The proposed research will resolve long-standing issues concerning the evolutionary history of split genes, and the software systems developed will represent a major methodological advance with broad implications for bioinformatics.
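As an illustration of what "reconstruction of evolutionary changes" at an intron site can involve, the following is a minimal sketch and not the project's actual method. The abstract does not name a reconstruction algorithm; Fitch parsimony is used here only because it is a standard, simple choice. The tree shape, the taxon names, the presence/absence states, and the function name `fitch` are all hypothetical.

```python
# Hypothetical illustration: Fitch parsimony for a single intron site,
# scored as 1 (intron present) or 0 (intron absent) in each taxon.
# The tree, taxon names, and states are invented for this example.

def fitch(tree, states):
    """Bottom-up Fitch pass over a rooted binary tree.

    tree   -- a leaf name (str) or a (left, right) tuple of subtrees
    states -- dict mapping leaf name to its observed state (0 or 1)
    Returns (candidate state set at this node, minimum changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change is needed here.
        return common, left_changes + right_changes
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1


tree = (("human", "mouse"), ("fly", ("worm", "yeast")))
intron_present = {"human": 1, "mouse": 1, "fly": 0, "worm": 1, "yeast": 0}

root_states, min_changes = fitch(tree, intron_present)
print(root_states, min_changes)
```

This pass only gives the minimum number of state changes and the candidate root states. Deciding whether each change is a gain or a loss requires a second, top-down pass or a different model, which the sketch does not attempt.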

Public Health Relevance:
This Public Health Relevance is not available.

Thesaurus Terms:
biochemical evolution, computer assisted sequence analysis, computer program /software, computer system design /evaluation, eukaryote, gene expression, intron, spliceosome
biomedical automation, family genetics, genetic library, molecular biology information system

Institution: UNIVERSITY OF MD BIOTECHNOLOGY INSTITUTE
701 E PRATT STREET, SUITE 200
BALTIMORE, MD 21202-3101
Fiscal Year: 2002
Department: NONE
Project Start: 15-SEP-2002
Project End: 14-SEP-2005
ICD: NATIONAL LIBRARY OF MEDICINE
IRG: GNM

