
Southeast Collaboratory for Structural Genomics (SECSG)


PI:  Bi-Cheng Wang, Ph.D., University of Georgia


Better Tools and Better Knowledge for Structural Genomics

Bioinformatics

Information integration system that combines experimental data and bioinformatics information from heterogeneous sources. The XML-based system is used for automated progress report generation, data mining, and automatic data release from structural genomics efforts.
Web interface generator for bioinformatics and crystallographic applications. It provides a user-friendly interface to complicated bioinformatics and crystallographic software tools.
Pipeline workflow control (Figure 1) for assembling bioinformatics and crystallographic software tools. This technology maximizes the performance of software tools by fine-tuning their program input parameters.
Bioinformatics system administration tools that ease administration for groups with many users and/or computers. This Linux-based tool contains the most popular bioinformatics programs and module libraries.

Protein Production

Small-scale solubility screen for rapid detection of soluble recombinant P. furiosus proteins. Using both the small-scale screen/pipeline and the specialized expression methods, it should be possible to express a higher percentage of any organism’s proteome in soluble form.
Novel ELISA solubility screen for C. elegans genes that compares the optical densities (ODs) of the whole lysate and the soluble fraction. This assay can also screen samples whose expression levels differ by as much as 100-fold on the same 96-well plate.
Alternate expression protocols using the trc promoter for expression of eukaryotic proteins. The promoter produces soluble, folded proteins with cofactors and is induced metabolically, saving both time and money.
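The comparison behind the ELISA solubility screen can be illustrated with a minimal sketch. It assumes that solubility is scored as the ratio of the soluble-fraction OD to the whole-lysate OD; the function name, the blank correction, and all OD readings below are hypothetical and are not taken from the SECSG protocol. Dividing by the whole-lysate signal is one way that wells with very different expression levels could be compared on the same plate.

```python
def soluble_fraction(od_whole, od_soluble, blank=0.0):
    """Return the blank-corrected ratio of soluble-fraction OD to whole-lysate OD."""
    whole = od_whole - blank
    soluble = od_soluble - blank
    if whole <= 0:
        raise ValueError("whole-lysate OD must exceed the blank")
    # Clamp at zero so a soluble reading below the blank gives a ratio of 0.
    return max(soluble, 0.0) / whole


# Hypothetical readings: (whole-lysate OD, soluble-fraction OD) per well.
wells = {
    "A1": (2.00, 1.50),    # high expresser, mostly soluble
    "A2": (0.02, 0.015),   # 100-fold lower expression, same solubility
    "A3": (1.00, 0.10),    # well expressed, mostly insoluble
}

ratios = {well: soluble_fraction(w, s) for well, (w, s) in wells.items()}
for well, ratio in ratios.items():
    print(f"{well}: soluble fraction = {ratio:.2f}")
```

In this illustration, wells A1 and A2 differ 100-fold in raw signal yet receive the same score, while A3 is flagged as largely insoluble.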

Crystallography

HT Crystallization pipeline (Figure 2) capable of screening 2000 or more proteins per year. Screening covers initial crystallization conditions, crystal imaging, crystal optimization and crystal diffraction characterization. The pipeline uses commercial robots and a locally developed data management system to ensure samples are tracked throughout the process.
Target salvaging using a two-tier approach to crystal production. The two-tier approach allows Tier-1 protein production efforts to focus on the primary job of producing new targets, while Tier-2 efforts focus on producing protein repeats and recovering targets that initially fail.
Direct Crystallography (Figure 3) and SAS structure determination. SECSG is building SAS phasing tools for unlabelled native crystals. SECSG has championed SAS structure determination; based on recent statistics from the PDB (06/23/04), SAS structure determination showed a 246% increase over the past year (88 SAS structures versus 119 MAD structures).
Signal-based data collection, a new data collection strategy in which the anomalous signal is monitored in order to increase the signal-to-noise ratio of the data.
Automated structure determination (Figure 4) using BioPerl-based automated pipelines. These cluster-based pipelines are capable of producing refined structures in a matter of hours from scaled anomalous scattering data with little user input.
Quality control (Figure 5) using the Richardson MolProbity analysis. Traditional quality indicators (such as the "free" R factor) show a significant improvement upon MolProbity testing and correction.

NMR

Direct determination of an accurate protein backbone from easily acquired NMR data. Focusing on the backbone structure using residual dipolar coupling data has reduced data acquisition times by approximately a factor of three relative to conventional NOE-based methods.

 
 
Figure 1. Pipeline workflow control system for assembling bioinformatics and structure determination pipelines.

 
 
 
Figure 2.  HT Crystal production pipeline.  Current pipeline capacity is 12 proteins per day.
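The daily capacity quoted in the caption can be related to the stated throughput of 2000 or more proteins per year with a short calculation. This is a rough sketch only: the figure of 250 operating days per year is an assumption made for illustration and is not stated in the source.

```python
PROTEINS_PER_DAY = 12      # capacity from the Figure 2 caption
TARGET_PER_YEAR = 2000     # throughput from the pipeline description
ASSUMED_WORKING_DAYS = 250 # assumption for illustration, not from the source

# Days of full-capacity operation needed to reach the annual figure.
days_needed = TARGET_PER_YEAR / PROTEINS_PER_DAY

# Annual throughput if the pipeline ran at capacity on every assumed working day.
annual_at_capacity = PROTEINS_PER_DAY * ASSUMED_WORKING_DAYS

print(f"Days at full capacity to screen {TARGET_PER_YEAR} proteins: {days_needed:.0f}")
print(f"Proteins per year at {ASSUMED_WORKING_DAYS} working days: {annual_at_capacity}")
```

Under that assumption, the quoted daily capacity is consistent with the annual figure, leaving headroom for repeats and downtime.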

 
 
 
 
Figure 3.  Powerful in-house SAS structure determination capabilities augment SECSG synchrotron data collection for low-cost HT structure solution.

 
 
 
 
Figure 4. SECSG automated structure solution pipelines use cluster-based parallel job submission to fine-screen program input parameter space to find the optimal parameter combination, thus increasing the chance for success.
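The idea of screening program input parameters in parallel can be sketched as follows. This is not the SECSG pipeline, which the text describes as BioPerl-based and cluster-based; it is a minimal Python illustration in which a local thread pool stands in for cluster job submission. The parameter names, their values, and the scoring function are all hypothetical; a real pipeline would run the structure determination programs and return a quality metric.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_trial(params):
    """Stand-in for one pipeline run; returns (score, params).

    Hypothetical scoring function that peaks at resolution_cutoff=2.5
    and solvent_content=0.50. A real job would launch external programs.
    """
    resolution_cutoff, solvent_content = params
    score = -((resolution_cutoff - 2.5) ** 2) - ((solvent_content - 0.50) ** 2)
    return score, params


def screen(grid, max_workers=4):
    """Evaluate every parameter combination in parallel and return the best."""
    combos = list(product(*grid.values()))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_trial, combos))
    best_score, best_params = max(results, key=lambda r: r[0])
    return dict(zip(grid.keys(), best_params)), best_score


# Hypothetical parameter grid: 3 x 3 = 9 combinations.
grid = {
    "resolution_cutoff": [2.0, 2.5, 3.0],
    "solvent_content": [0.40, 0.50, 0.60],
}
best, score = screen(grid)
print("best parameters:", best)
```

Because each combination is evaluated independently, the same structure maps naturally onto a batch queue, with one job per parameter combination.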

 
 
 
 
Figure 5.  Production of high quality structures is an important goal of SECSG’s structure production efforts.
This page last updated November 19, 2008