Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching.

Du P, Kibbe WA, Lin SM.
Bioinformatics 2006;22:2059-2065 doi:10.1093/bioinformatics/btl355

Summary

Following the 2005 Cold Spring Harbor - Banbury Center CFS Computational Challenge (C3) Workshop, CDC provided data sets from the Wichita in-hospital clinical study to Duke University for use in the Sixth International Conference for the Critical Assessment of Microarray Data Analysis (CAMDA 2006). Duke University founded CAMDA to provide a forum for critically assessing the techniques used in microarray data mining. CAMDA aims to establish the state of the art in microarray data mining, to identify progress, and to highlight directions for future effort. It takes a community-wide experiment approach in which the scientific community analyzes the same standard data sets. Researchers worldwide are invited to take the CAMDA challenge, and those whose results are accepted are invited to give a 25-minute oral presentation. The 2006 CAMDA was the first to use a single common challenge data set, which contained all clinical, gene expression, SNP, and proteomics data from the Wichita clinical study.

To date, 10 peer-reviewed publications have resulted from the CAMDA challenge. This was one of the first analyses published following CAMDA and came from Dr. Lin’s analytic group at the Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago. The analysis addresses the problem of identifying real peaks amid the noise in mass spectrometry data, and of doing so without extensive preprocessing of the data. The group developed a continuous wavelet transform method that they tested in part by using SELDI data from the Wichita in-hospital study of CFS.

Abstract

Motivation: A major problem for current peak detection algorithms is that noise in mass spectrometry (MS) spectra gives rise to a high rate of false positives. The false positive rate is especially problematic in detecting peaks with low amplitudes. Usually, various baseline correction algorithms and smoothing methods are applied before attempting peak detection. This approach is very sensitive to the amount of smoothing and aggressiveness of the baseline correction, which contribute to making peak detection results inconsistent between runs, instrumentation and analysis methods.

Results: Most peak detection algorithms simply identify peaks based on amplitude, ignoring the additional information present in the shape of the peaks in a spectrum. In our experience, ‘true’ peaks have characteristic shapes, and providing a shape-matching function that provides a ‘goodness of fit’ coefficient should provide a more robust peak identification method. Based on these observations, a continuous wavelet transform (CWT)-based peak detection algorithm has been devised that identifies peaks with different scales and amplitudes. By transforming the spectrum into wavelet space, the pattern-matching problem is simplified and in addition provides a powerful technique for identifying and separating the signal from the spike noise and colored noise. This transformation, with the additional information provided by the 2D CWT coefficients, can greatly enhance the effective signal-to-noise ratio. Furthermore, with this technique no baseline removal or peak smoothing preprocessing steps are required before peak detection, and this improves the robustness of peak detection under a variety of conditions. The algorithm was evaluated with SELDI-TOF spectra with known polypeptide positions. Comparisons with two other popular algorithms were performed. The results show the CWT-based algorithm can identify both strong and weak peaks while keeping false positive rate low.
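
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Mexican hat wavelet, a simple nearest-neighbor ridge tracker that starts at the largest scale, and a global noise estimate taken from the smallest scale. The function names, the scale range, the thresholds (`min_ridge_len`, `snr_min`) and the synthetic spectrum are all hypothetical choices made for illustration.

```python
import numpy as np

def mexican_hat(n_points, scale):
    """Mexican hat (Ricker) wavelet of width `scale`, sampled at n_points."""
    t = (np.arange(n_points) - (n_points - 1) / 2.0) / scale
    norm = 2.0 / (np.sqrt(3.0 * scale) * np.pi ** 0.25)
    return norm * (1.0 - t**2) * np.exp(-(t**2) / 2.0)

def cwt(signal, scales):
    """Return one row of wavelet coefficients per scale (2D array)."""
    rows = []
    for s in scales:
        half = 5 * int(s)  # kernel spans +/- 5 scale units
        padded = np.pad(signal, half, mode="reflect")  # limit edge artifacts
        kernel = mexican_hat(2 * half + 1, s)
        rows.append(np.convolve(padded, kernel, mode="valid"))
    return np.array(rows)

def local_maxima(row):
    """Indices of interior samples larger than their neighbors."""
    inner = row[1:-1]
    return np.flatnonzero((inner > row[:-2]) & (inner >= row[2:])) + 1

def detect_peaks(signal, scales, min_ridge_len=3, snr_min=3.0):
    """Simplified ridge-based peak detection in wavelet space (assumed design)."""
    coef = cwt(np.asarray(signal, dtype=float), scales)
    maxima = [local_maxima(row) for row in coef]
    # Assumption: noise level = 95th percentile of |coefficients| at the smallest scale.
    noise = np.percentile(np.abs(coef[0]), 95)
    peaks = []
    for start in maxima[-1]:  # follow each ridge from the largest scale down
        pos = best_pos = int(start)
        best = coef[-1, start]
        length = 1
        for k in range(len(scales) - 2, -1, -1):
            cand = maxima[k]
            if cand.size == 0:
                break
            nearest = int(cand[np.argmin(np.abs(cand - pos))])
            if abs(nearest - pos) > scales[k + 1]:
                break  # ridge is broken: no maximum close enough at this scale
            pos, length = nearest, length + 1
            if coef[k, nearest] > best:
                best, best_pos = coef[k, nearest], nearest
        # Keep ridges that persist across scales and stand out from the noise.
        if length >= min_ridge_len and best >= snr_min * noise:
            peaks.append(best_pos)
    return sorted(set(peaks))

# Synthetic spectrum: sloping baseline, one strong and one weak peak, white noise.
rng = np.random.default_rng(0)
x = np.arange(1000)
baseline = 2.0 * np.exp(-x / 1000.0)
strong = 10.0 * np.exp(-((x - 300) ** 2) / (2 * 8.0**2))
weak = 3.0 * np.exp(-((x - 700) ** 2) / (2 * 8.0**2))
spectrum = baseline + strong + weak + rng.normal(0.0, 0.2, x.size)

found = detect_peaks(spectrum, scales=list(range(2, 17, 2)))
print(found)
```

No baseline removal or smoothing is applied before detection in this sketch: the wavelet has zero mean, so a slowly varying baseline contributes little to the coefficients. A fuller implementation would also track ridges that begin at intermediate scales and estimate noise in a local window rather than globally.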

Page last modified on October 24, 2007


“Safer Healthier People”
Centers for Disease Control and Prevention, 1600 Clifton Rd, Atlanta, GA 30333, USA
Tel: 404-639-3311  /  Public Inquiries: (404) 639-3534  /  (800) 311-3435