Advanced Proteomic Platforms and Computational Sciences for the NCI Clinical Proteomic Technologies for Cancer

The Advanced Proteomic Platforms and Computational Sciences initiative is a comprehensive program focused on developing innovative tools, reagents, and enabling technologies for protein/peptide measurement, including algorithms and computational methods to interrogate emerging pre-processed data sets. It also sets out to establish the Advanced Platforms, Data Analysis Methods, and Computational Sciences components of the NCI Clinical Proteomic Technologies for Cancer. The initiative supports two focus areas for protein measurement technology and its application in cancer research:

The development of innovative high-throughput technology for protein and peptide detection, recognition, measurement, and characterization in biological fluids that will overcome current barriers in protein/peptide feature detection, identification, quantification, and validation.
The development of computational, statistical, and mathematical approaches for the analysis, processing, and facile exchange of large proteomic data sets.

Advancing the technological and analytical capabilities in proteomic research will allow the research community to better characterize and understand the differences between the normal and diseased human proteome and to develop diagnostic and treatment procedures based on these distinctions.

Computational Sciences

Proteomic Characterization of Alternate Splicing and cSNP Protein Isoforms
Georgetown University Medical Center*
Principal Investigator: Nathan J. Edwards, Ph.D.

Nathan J. Edwards, Ph.D. Dr. Edwards received a Ph.D. in Operations Research from Cornell University in 2001. Joining the Informatics Research group at Celera Genomics, Dr. Edwards worked on SCOPE, for identifying peptides from tandem mass spectra by searching protein sequence databases, and on other critical elements of the analysis infrastructure for Celera's high-throughput proteomics facility. Moving to Applied Biosystems in 2002, still as part of the Informatics Research group, he led research on algorithmic and statistical issues arising in the analysis of proteomics biomarker workflows and developed the Biomarker Toolbox prototype.

Since joining the Center for Bioinformatics and Computational Biology at the University of Maryland, College Park, in 2004, Dr. Edwards' research has focused on the discovery of novel peptides that characterize alternative splicing, coding SNPs, and mutant protein isoforms, using genomic and EST sequences; and on the rapid identification of microorganisms by MALDI-TOF mass spectrometry and bioinformatics, in collaboration with researchers at the University of Maryland, College Park, and the Johns Hopkins School of Public Health. *In 2008, Dr. Edwards became an assistant professor in the Department of Biochemistry and Molecular & Cellular Biology at Georgetown University Medical Center.

Current proteomic workflows are fundamentally limited in their ability to characterize alternative splice and variant protein isoforms. To address this issue, this research team is developing an infrastructure to enable characterization of protein isoforms arising from alternative splicing and coding single nucleotide polymorphisms (cSNPs).

Enhancement of MS Signal Processing Toward Improved Cancer Biomarker Discovery
College of William and Mary
Principal Investigator: Dariya Malyarenko, Ph.D.

Dariya Malyarenko, Ph.D. Dr. Malyarenko is a technical lead for the collaborative initiative between the College of William and Mary, INCOGEN, Inc., and Eastern Virginia Medical School targeted at developing new algorithms and software tools for signal processing and statistical analysis of mass spectrometry data with intended use for cancer biomarker discovery. Her area of expertise is in experimental and computational physics applied to analysis of nuclear magnetic resonance and mass spectrometry data for synthetic polymers and biomolecules. Her current research interests are in time series analysis, noise filtering, resolution enhancement, and classification of spectroscopic data for cancer profiling proteomics.

To increase the effectiveness of cancer protein/peptide detection from label-free Matrix Assisted Laser Desorption Ionization Time-of-Flight (MALDI-TOF) mass spectra for verification and identification, this group is developing novel computational tools that can be used across all laboratories employing this MS technology.

A Platform for Pattern-Based Proteomic Biomarker Discovery
Massachusetts Institute of Technology
Principal Investigator: Denkanikota Mani, Ph.D.

Dr. Mani is a Senior Computational Biologist in the Cancer Program and Proteomics Group at the Broad Institute of MIT and Harvard. He is an experienced computational scientist with extensive training and expertise in computer science; pattern recognition and machine learning; data mining; and parallel computing. He has been directly responsible for the design and implementation of pattern-based proteomic data analysis methods for biomarker discovery from a range of mass spectrometric data. The proteomics pipeline he developed has been used to analyze a variety of human and animal tissues representing a spectrum of cancers and to discover biomarker candidates for enabling early detection, diagnosis and targeted therapeutics. Dr. Mani has also applied computational pattern recognition methodology to address biological problems involving diverse data ranging from gene expression profiles to small molecule screens.

As part of the current program, Dr. Mani and his collaborators at The Broad Institute are working on developing robust, high-throughput methods for proteomic biomarker discovery using high information content mass spectrometry data from state-of-the-art instrumentation. This novel platform combines peptide identity with high-resolution mass spectrometric pattern to identify potential biomarkers comprising both identified peptides/proteins, and statistically significant mass spectrometric peaks and patterns easily amenable to subsequent identification.

Dr. Mani has a Bachelor's degree in Electronics & Telecommunication Engineering from Bangalore University, India, a Master's degree in Computer Science from the Indian Institute of Technology at Kanpur, and a Ph.D. in Computer and Information Science from the University of Pennsylvania at Philadelphia. Prior to joining The Broad Institute, he was a Principal Member of Technical Staff at Verizon Laboratories, where he applied pattern recognition to diverse business and customer data in order to glean insights enabling informed and data-driven corporate decision making. He was also a Principal Research Engineer at Thinking Machines Corporation (now part of Oracle), where he designed and implemented massively parallel data mining algorithms, and applied them to mining large data warehouses.

To construct and validate a software system for protein/peptide pattern discovery, this research team will combine peptide identity and pattern information obtained from high-resolution, high-mass-accuracy spectra. The approach uses peptide identifications from tandem MS throughout the processing of the data set, while still allowing quantification and comparison of unidentified peptide signals.

Analysis and Statistical Validation of Proteomic Datasets
University of Michigan
Principal Investigator: Alexey I. Nesvizhskii, Ph.D.

Dr. Nesvizhskii received an M.S. (with honors) from St. Petersburg State Technical University, Department of Physics and Technology, St. Petersburg, Russia in 1995 and a Ph.D. in Physics from the University of Washington, Seattle in 2001. He completed postdoctoral training in Ruedi Aebersold's laboratory at the Institute for Systems Biology in Seattle, Washington from 2001 to 2003 and joined the staff as a Research Scientist upon completion of training.

Dr. Nesvizhskii was the recipient of a medal for "Best Student Scientific Work" awarded by the Russian Federation State Committee of Higher Education and was named Russian Presidential Fellow for the period 1994-1995 and Soros Fellow for the period 1995-1996. He is a member of the Human Proteome Organization (HUPO), the International Society for Computational Biology, and the American Society for Mass Spectrometry.

In November 2005, Dr. Nesvizhskii joined the faculty of the Department of Pathology at the University of Michigan as an Assistant Professor.

Dr. Nesvizhskii’s research interest is in the field of quantitative proteomics, with a focus on the development of computational methods for processing and extracting biological information from complex proteomic datasets. Similar to other global high throughput technologies such as microarray gene expression analysis, proteomics is extremely dependent on the ability to quickly and reliably analyze large amounts of experimental data. One of the aims of Dr. Nesvizhskii’s research is to close the critical gap between the development of high throughput quantitative proteomics methods and the ability to deal with the resulting data deluge and to convert it into new biological knowledge or to develop new disease biomarkers. The efforts in his lab range from the development of computational tools and statistical methods for mass spectrometry-based peptide and protein identification and quantification, to the establishment of guidelines and standards for proteomic data analysis and publication, to the creation of public databases and proteomic data repositories and integration of proteomic with genomic and other types of biological data.

The goal of this research group is to build more reliable statistical algorithms and models for analyzing large proteomic data sets. These algorithms and models are needed to assign peptides to tandem mass spectrometry (MS/MS) spectra, infer proteins by assembling identified peptides, estimate quantitative changes, assess the quality of MS/MS data and spectra, and analyze MS/MS data from multiple cross-laboratory studies.
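As a rough illustration of one step named above, inferring proteins by assembling identified peptides, the sketch below applies a simple greedy parsimony rule: repeatedly pick the protein that explains the most still-unexplained peptides. This is a generic textbook heuristic chosen for illustration, not the statistical models this group is developing; the function name `infer_proteins` and the toy protein and peptide labels are hypothetical.

```python
from typing import Dict, List, Set


def infer_proteins(protein_to_peptides: Dict[str, Set[str]],
                   identified: Set[str]) -> List[str]:
    """Greedy parsimony (illustrative only): return a small list of
    proteins that together explain the identified peptides."""
    # Restrict each protein to the peptides that were actually identified.
    candidates = {prot: peps & identified
                  for prot, peps in protein_to_peptides.items()}
    remaining = set(identified)
    chosen: List[str] = []
    while remaining and candidates:
        # Pick the protein covering the most unexplained peptides.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(candidates),
                   key=lambda prot: len(candidates[prot] & remaining))
        gained = candidates[best] & remaining
        if not gained:
            # Leftover peptides map to no protein in the database.
            break
        chosen.append(best)
        remaining -= gained
    return chosen


if __name__ == "__main__":
    # Hypothetical toy database: peptide p3 is shared by A and B,
    # and peptide p4 is shared by B and C.
    database = {
        "A": {"p1", "p2", "p3"},
        "B": {"p3", "p4"},
        "C": {"p4"},
    }
    print(infer_proteins(database, {"p1", "p2", "p3", "p4"}))
```

A greedy rule like this ignores identification confidence entirely, which is one reason statistical models of the kind described above are needed in practice.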

Quantitative Methods for Spectral and Image Data in Proteomics Research
Fred Hutchinson Cancer Research Center
Principal Investigator: Timothy W. Randolph, Ph.D.

Timothy W. Randolph, Ph.D. Dr. Randolph is a Senior Fellow at the University of Washington Department of Biostatistics and a Senior Staff Scientist at the Fred Hutchinson Cancer Research Center's program on Biostatistics and Biomathematics. He is completing a career transition from pure mathematics to one focusing on the analysis of high-dimensional data from proteomic platforms for which he is developing methods for signal processing, statistical analysis, and classification.

Prior to his current positions, Dr. Randolph was an Associate Professor of Mathematics and Statistics at the University of Missouri-Rolla where his research focused on abstract dynamical systems, including the mathematics of control theory. His new research draws on this background using tools from harmonic/wavelet analysis, functional analysis, and operator theory.

Dr. Randolph received his Ph.D. in mathematics from the University of Oregon. While at the University of Missouri, his research was facilitated by several awards from the Missouri Research Board. In 2002, he was awarded a grant from the NIH Institute of General Medical Sciences program on Mentored Quantitative Research Career Development, allowing him to turn his attention to problems related to health and disease. He has authored papers in journals ranging from pure mathematics to computer methods and engineering, to molecular proteomics.

There is a rapidly growing need for rigorous quantitative methods that increase the power to perform comparative proteomics for current and upcoming platforms in proteomic research. This team hopes to meet that need through the use of wavelet scale functions to define peaks, and a penalized regression model to align spectra.

Computational Tools for Cancer Proteomics
University of Colorado at Boulder
Principal Investigator: Katheryn A. Resing, Ph.D.

Dr. Resing holds a Ph.D. from the University of Washington, Seattle, where she also completed her postdoctoral fellowship. Her research centers on developing global protein analysis of mammalian cells. The specific technique she uses involves digestion of an extract from cells into peptides, followed by multi-dimensional chromatography of the peptides, where the final stage is reverse phase coupled to a mass spectrometer. The resulting data are then used to search a protein or peptide database in order to identify the peptides. The use of peptide databases provides more sensitive search results in situations where normal search methods create a huge database, e.g., for splice junctions, alternative start/stop sites, unannotated genes, and modified peptides.

A large part of her research involves developing new methods for data management and mining, including new methods for validating the peptide assignments, and for quantifying and comparing different samples or samples analyzed using different mass spectrometry (MS) instruments and methods. She is also developing new informatics tools that use this information to tackle several biological and clinical problems, namely how the proteome reflects modification of signaling processes in the cell, including expression, splicing, and phosphorylation.

The specific questions currently under investigation are analysis of melanoma progression biology, differentiation of K562 cultured cells into erythrocyte or megakaryocyte lineages, and changes in response to MKK1/2 or MKK5 in neuronal cells, specifically in hippocampus and PC12 cells.

This group of scientists will focus on the computational methods needed to quantify protein expression changes, to increase the accuracy of peptide and protein identification from tandem mass spectrometry (MS/MS) spectra, to improve phosphoproteomics analysis, and to cluster multidimensional peptides and proteins across many samples.

New Proteomic Algorithms to Identify Mutant or Modified Proteins
Vanderbilt University
Principal Investigator: David L. Tabb, Ph.D.

David L. Tabb, Ph.D. Dr. Tabb was named one of two White House Presidential Scholars for Missouri in 1992. He attended college at the University of Arkansas as a Sturgis Fellow. He majored in Biology and minored in Computer Science, graduating summa cum laude in 1996. His honors thesis described the design of software for the analysis of genetic sequences.

He carried this interest in bioinformatics to graduate school at the University of Washington's Molecular Biotechnology department. There he studied proteomics as a graduate student under John Yates. The laboratory moved to The Scripps Research Institute in 2000. Dr. Tabb worked closely with Vicki Wysocki's group at the University of Arizona in characterizing peptide fragmentation during low-energy Collision Induced Dissociation through a series of key papers. He created software for data mining proteomic results to extract biological information more consistently and rapidly; "DTASelect" has become one of the most widely used software tools in proteomics. He leveraged his understanding of peptide fragmentation to create the first fully-automated sequence tag infrastructure in the "GutenTag" software package.

In 2003, Dr. Tabb completed his Ph.D. and began a postdoctoral fellowship at Oak Ridge National Laboratory. While there, he developed "DBDigger," a novel database search algorithm that streamlined the process of proteomic identification. He focused on reducing the redundancy of these data sets through the creation of "MS2Grouper," a tool that employs graph theory to examine spectral interrelationships. Dr. Tabb also examined the use of high-resolution mass spectra for the purpose of peptide charge state inference.

In 2005, Dr. Tabb joined the faculty of Vanderbilt University Medical Center to lead a group in the Mass Spectrometry Research Center. In 2006, he was also appointed as an assistant professor in the Biochemistry Department. His work focuses on identifying peptides more successfully from clinical samples, specifically peptides with mutated or modified sequences.

The development of new proteomic algorithms to identify protein mutations and modifications is a critical need. If successful, this research team's efforts could lead to a highly useful methodology and a high-throughput computational infrastructure for accurate identification of mutations and modifications.

PICquant-An Integrated Platform for Biomarker Discovery
University of Virginia
Principal Investigator: Dennis J. Templeton, M.D., Ph.D.

Dennis J. Templeton, M.D., Ph.D. Dr. Templeton received his M.D. and Ph.D. degrees at the University of California Southern California, and trained in Pathology at the New England Deaconess Hospital. He has long been active in cancer research, earning his Ph.D. at the Salk Institute studying tumor virology and completing a postdoctoral fellowship at the Whitehead Institute in molecular oncology and signal transduction. After ten years at Case Western Reserve University, he was appointed Chair of Pathology at the University of Virginia. As part of the mission of pathology to discover improved diagnostics, he established the Pathology MS Facility and the biomarker discovery program, and has been closely involved in the day-to-day work, particularly that involving peptide chemistry.

The Program is highly integrated. It coalesces clinical informatics, quantitative proteomics, and automated data processing routines to allow rapid analysis of data from dozens of individual patients that would be impractical using manual analysis, with the goal of realizing the potential of peptide diagnostics in clinical medicine.

One promising proteomic application is the potential for a complete analytic platform for urine biomarker discovery. Using PIC labeling, this research team seeks to develop a new labeling reagent for peptides, in addition to a clinical registry that links acquired urine specimens to current and prospective clinical information, including outcomes. The registry enables multivariate clustering of disease states with quantified protein families.

Advanced Proteomic Platforms

Proteomic Phosphopeptide Chip Technology for Protein Profiling
University of Houston
Principal Investigator: Xiaolian Gao, Ph.D.

Xiaolian Gao, Ph.D. Dr. Gao is a Professor of Biology and Biochemistry and Chemistry, and Director of the Keck/IMD NMR Center at the University of Houston. Pursuing research in the interdisciplinary areas of chemistry and biology, she developed novel methods for miniaturizing massively parallel synthesis of biomolecules on high-density, microfluidic microchips and multiplex biochemical assays on the same chip platform; established microchip-based gene synthesis for many genomic and proteomic applications; and demonstrated that microchips can be used as pico-liter titer plates for ultra-high throughput quantitative measurements of binding and enzymatic activities. Dr. Gao's second major research area is solution structures and studies of nucleic acids, proteins, and biomolecular complexes. Dr. Gao holds a B.S. degree from the Beijing Institute of Chemical Engineering and a Ph.D. degree in Chemistry from Rutgers University; she did postdoctoral work in NMR-based structural biology at Columbia University Medical School. Before joining the faculty at the University of Houston, she was a Principal Investigator at the Glaxo Research Institute. Dr. Gao is a devoted educator, mentor, and founder of several biotechnology companies.

To develop a novel proteomic phosphopeptide microchip technology platform that can profile proteins carrying phosphopeptide binding domains, this research group is taking a comprehensive approach to build all the necessary parts, including software development, chip fabrication, and construction of analytic tools.

Global Production of Disease-Specific Monoclonal Ab's
Northeastern University
Principal Investigator: Barry L. Karger, Ph.D.

Barry L. Karger, Ph.D. Dr. Karger holds a Ph.D. in chemistry from Cornell University and a B.S. in chemistry from the Massachusetts Institute of Technology. His research focuses on the application of microscale separations to problems of biological significance. Current research involves (1) biomarker discovery and monitoring; (2) ultrasensitive LC-ESI-MS using narrow bore monolithic (20 µm i.d.) and porous layer open tubular (PLOT, 10 µm i.d.) columns coupled to MS; (3) laser capture microdissection analysis of cervical scrapings in conjunction with Pap smear tests; (4) ultratrace characterization of complex proteins, including post-translational modifications using extended range proteomic analysis (ERPA); (5) ultratrace fast MALDI-TOF/TOF MS using a 2 kHz laser; and (6) monoclonal antibodies in biomarker discovery and monitoring. The laboratory provides an interdisciplinary environment for research in collaboration with academic, medical and industrial scientists.

This research group seeks to demonstrate the feasibility of a global approach to generating disease-specific monoclonal antibodies (mAbs) to low-level proteins for the discovery and validation of cancer biomarkers.

Top-Down Mass Spectrometry of Salivary Fluids for Cancer Assessment
University of California Los Angeles
Principal Investigator: Joseph A. Loo, Ph.D.

Joseph A. Loo, Ph.D. Dr. Loo is a Professor in the Department of Biological Chemistry, David Geffen School of Medicine, and in the Department of Chemistry & Biochemistry at the University of California, Los Angeles (UCLA), and is the Director of UCLA Jonsson Comprehensive Cancer Center Mass Spectrometry and Proteomics Shared Resource. He is also a member of UCLA/DOE Institute for Genomics and Proteomics, the UCLA Molecular Biology Institute, and the UCLA Jonsson Comprehensive Cancer Center.

His research interests include the development of bioanalytical mass spectrometry methods for the structural characterization of proteins and their application for proteomics and disease biomarkers. He is also interested in development and application of electrospray ionization mass spectrometry for the study of noncovalently-bound macromolecular complexes and their interactions with other binding partners and ligands. He is an author of over 140 scientific publications. He has been on the Editorial Boards of Bioconjugate Chemistry and Analytical Chemistry, and currently he serves on the Editorial Advisory Boards for the Journal of the American Society for Mass Spectrometry, Rapid Communications in Mass Spectrometry, and Chemical & Engineering News. From 2000 to 2002, he served on the Board of Directors for the American Society for Mass Spectrometry.

Before he joined UCLA, he was Group Leader of the Biological Mass Spectrometry and Proteomics Team at Parke-Davis Pharmaceutical (currently Pfizer Global Research), Ann Arbor, MI. He worked at Warner-Lambert/Parke-Davis/Pfizer for nearly 10 years before moving to UCLA.

He received his B.S. degree in Chemistry from Clarkson University (Potsdam, NY), and his Ph.D. degree in analytical chemistry from Cornell University in 1987 (Prof. Fred W. McLafferty). He carried out postdoctoral research at Pacific Northwest National Laboratory, Richland, WA (Dr. Richard D. Smith).

The "top-down" approach has great potential for structural characterization of known proteins, and may even become a new tool for the identification of unknown proteins. The goal of this research group is to develop a new type of ion source, electrospray-assisted laser desorption ionization (ELDI), for top-down sequencing of salivary proteins.

A New Platform to Screen Serum for Cancer Membrane Proteins
Institute for Systems Biology
Principal Investigator: Daniel B. Martin, M.D.

In an effort to obtain better diagnostic markers of prostate cancer, this group will develop and implement a proteomic platform for the capture and analysis of membrane glycoproteins in cell culture models of the disease. The goal of this work is to define a rapid, specific, reliable, and inexpensive strategy to identify and validate prostate cancer protein markers.

A Proteomics Approach to Ubiquitination
Emory University
Principal Investigator: Junmin Peng, Ph.D.

An accurate and quantitative biochemical analysis of the ubiquitination proteome of mammalian tissues and human brain tumors has yet to be carried out. Using high-resolution mass spectrometry, this team aims to provide a new and powerful preparative proteomic technology to capture and isolate this interesting, and largely uninvestigated, class of molecules.

A Proteomics Platform for Quantitative, Ultra-High Throughput, and Ultra-Sensitive
Battelle Pacific Northwest Laboratories
Principal Investigator: Richard D. Smith, Ph.D.

Richard D. Smith, Ph.D. Dr. Smith's research has involved the development and application of advanced analytical methods and instrumentation, with particular emphasis on high-resolution separations and mass spectrometry, and their applications in biological and biomedical research. Much of his research over the last 15 years has focused on the development and application of new ultra-sensitive and comprehensive methods for quantitatively probing the entire array of proteins expressed by a cell, tissue or organism, i.e., their "proteomes." Current interests also include greatly increasing the throughput and sensitivity of proteomics to meet the needs in systems biology research and biomarker discovery.

Dr. Smith is an adjunct faculty member of the Department of Chemistry, Washington State University, and the Department of Chemistry, University of Utah, and an affiliate faculty member of the Department of Chemistry, University of Idaho. He has presented more than 350 invited or plenary lectures at national and international scientific meetings, and is the author or co-author of more than 600 publications. Dr. Smith holds twenty-seven patents and has been the recipient of seven R&D 100 Awards. He is Director of the NIH Research Resource for Integrative Proteomics located at PNNL.

Many proteins of relevance to cancer are of extremely low abundance in clinical samples, making them difficult to detect reliably. To address this problem, the research group seeks to develop a cancer protein/peptide assessment platform for analyses of clinically relevant samples that, compared to existing platforms, will provide measurements that are much more robust, are of higher sensitivity, offer more than an order of magnitude greater throughput, and have improved quantitative utility, particularly for low-abundance proteins.

Aptamer-Based Proteomic Analysis for Cancer Signatures
Michigan State University
Principal Investigator: Stephen P. Walton, Sc.D.

Stephen P. Walton, Sc.D. Dr. Walton received his B.ChE. from Georgia Tech, and an M.S. (Chemical Engineering Practice) and Sc.D. in the Department of Chemical Engineering at MIT. While at MIT, he was awarded a Shell Foundation Fellowship and was an NIH Biotechnology Predoctoral Trainee. Upon completion of his Sc.D., he joined the Stanford Genome Technology Center, receiving an NIH Kirschstein postdoctoral fellowship for his research. His research is focused on the application of genomics tools to the measurement of DNA, RNA, and protein expression profiles, as well as the engineering of active biomolecules through kinetic and thermodynamic design.

To test the potential for aptamers to detect specific proteins in biological samples, this scientific team will use bead-based or oligonucleotide arrays of molecular bar-codes to detect protein-binding aptamers containing molecular bar-codes.