Machine learning and statistical approaches to the quality classification of tandem mass spectrometry

Posted on: 2007-02-19
Degree: M.S.
Type: Thesis
University: University of Southern California
Candidate: Mo, Lijuan
Full Text: PDF
GTID: 2451390005989180
Subject: Biology
Abstract/Summary:
In this thesis, we present machine learning and statistical regression methods for assessing the quality of tandem mass spectra. Machine learning methods are used to classify MS/MS spectra as good or bad quality and to select useful features for the statistical regression model. The performance of the different machine learning methods is compared, and the Random Forest method was found to give the best performance with little overfitting. The bias and variance of the different machine learning methods are also analyzed. A stepwise regression procedure is then applied to select the variables used in the linear regression model. The results show that the model fits the data quite well and thus provides a way to predict the quality of a tandem mass spectrum from its spectral features.
Keywords/Search Tags:Machine learning, Tandem mass, Quality, Statistical