Construction and Application of a Spectrum Management and Analysis System Based on MS Serum Peptidome Profiles

Posted on: 2010-04-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Cao
GTID: 1114360275462303
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
In the post-genomic era, with the completion of large-scale genome sequencing for humans and model organisms and major breakthroughs in mass spectrometry, proteomics has made great progress in both basic research and clinical application. As a branch of proteomics, clinical proteomics focuses on the application of proteomic techniques in clinical medicine, including disease prevention, early detection and support for therapy. Many kinds of data are involved in clinical proteomics, and the serum peptidome profile is one of the most important. It is a profile of the proteins or peptides distributed in serum, which can be obtained via matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS) or surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF/MS). By comparing peptide profiles between patient and control groups, we can find, at the protein level, differentially expressed proteins or peptides that lead to the development of a diseased condition.

Serum peptidome profiling shows broad prospects in clinical studies, such as biomarker discovery, early detection and personalized medicine. However, the following issues should be considered before applying serum peptidome profiling in clinical studies.

First, sample selection, which affects the result of serum peptidome profiling, should be carefully assessed. We should consider individual differences among patients and controls, such as age, sex, race, family history and medical history. Meanwhile, the different disease stages of an individual must be identified. To construct and validate mathematical models for disease diagnosis, the sample information and the patient records linked to the samples should be collected comprehensively.

Second, because many factors in sample handling, such as collection, transportation and storage, can influence the result, it is important to evaluate their effect on diagnostic sensitivity. 
Recording detailed information on the collection, processing and storage of samples is crucial for both efficient reporting of biomedical studies and subsequent data analysis.

Finally, because there is much noise in the raw MS data from MALDI-TOF/MS or SELDI-TOF/MS, data preprocessing must be conducted.

Because the number of variables and the number of samples in serum peptidome profiles are very large, bioinformatics tools play a key role in discovering a set of disease-related peaks. To date, several projects have addressed MS data management and analysis. However, few emphasize both the management of patient information and MALDI-TOF or SELDI-TOF MS-based statistical analysis. Here, we developed a flexible and compact software package, BioSunMS, for MALDI-TOF or SELDI-TOF MS-based clinical proteomics studies. BioSunMS was designed to support decision-making and to allow patient information and spectra data to be stored, managed, processed and analyzed. The software was tested with MS files of serum samples from patients with lung cancer and from control groups. The dissertation is divided into the following four parts.

1. Construction of the database for serum peptidome profiles

The database stores data from patients and control groups. The diseases covered include lung cancer, liver cancer, breast cancer, rectal cancer, prostate cancer, leukaemia and so on. Several tables record the information corresponding to each sample, with fields such as sample source, clinical diagnosis, sample preprocessing, detection method and MS data. Users can submit serum peptidome profile data to the database and can query it for spectra meeting desired criteria, such as research group, user, sample state, sample type, patient and characteristic description.

2. Development of BioSunMS software for the analysis of serum peptide profiles

BioSunMS includes two main modules: spectrum processing and MS profile analysis. The spectrum processing module performs spectrum import, spectrum export and related preprocessing such as calibration, normalization and peak detection. The MS profile analysis module is designed for sample classification and identification of potential biomarkers. It includes feature selection and model construction, allowing rapid automated analysis to identify potential biomarkers.

3. Sample class discovery and sample class prediction based on serum peptidome profiles

To provide a platform for clinical researchers, we built models based on the datasets in the database, using machine learning and statistical methods such as SVM, PCA, GA, Naïve Bayes and PLS.

4. Construction of a serum peptidome profile-based model for lung cancer

The study was carried out in collaboration with the National Center of Biomedical Analysis. During the preliminary research period, they collected and tested 1000 control samples and more than 2000 cancer samples by mass spectrometry. The dataset included 254 patients with lung cancer and 257 corresponding normal control samples. To construct the model for the diagnosis of lung cancer using BioSunMS, we first selected 150 lung cancer samples and 150 healthy control samples as the training dataset. The remaining samples, 104 lung cancer samples and 107 healthy control samples, were used as the test dataset. Then, the t-test was used to screen for statistically significant peaks in the training dataset, and 74 peaks were found. Finally, a support vector machine (SVM) was used to construct the model. The accuracy, sensitivity and specificity of the model on the test dataset were 92.3%, 96.3% and 94.3%, respectively. 
The model has potential application in the early detection of lung cancer.

In summary, we have developed the software BioSunMS, which integrates patient information and MS data storage, processing, sample class discovery, sample classification and sample prediction in a single, user-friendly workbench. The project provides an additional solution for analyzing high-throughput MS data from serum peptidome profiles. Using BioSunMS, we also constructed an early detection model for lung cancer based on the serum peptidome profile. The present study thus provides bioinformatics support for the application of serum peptide profiles in clinical studies.
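
The peak screening and model evaluation steps described in part 4 can be illustrated with a minimal sketch. It is not BioSunMS code: the function names, the data layout (one list of aligned peak intensities per spectrum), the toy values, and the use of Welch's t statistic with a fixed cutoff instead of a p-value are all assumptions made for illustration, and the SVM step itself is left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples of peak intensities.

    Assumes each sample has at least two values and non-zero total variance.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def screen_peaks(cases, controls, threshold):
    """Return indices of peaks whose |t| exceeds a chosen cutoff.

    cases / controls: lists of spectra, each a list of intensities aligned
    to the same peak positions (hypothetical layout, not the BioSunMS format).
    """
    selected = []
    for j in range(len(cases[0])):
        t = welch_t([s[j] for s in cases], [s[j] for s in controls])
        if abs(t) > threshold:
            selected.append(j)
    return selected


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity and specificity; label 1 = cancer, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of cancer samples detected
        "specificity": tn / (tn + fp),  # fraction of controls correctly cleared
    }


# Toy data: three aligned peaks per spectrum; only peak 0 differs by group.
cases = [[10.0, 5.0, 3.0], [11.0, 5.2, 2.9], [12.0, 4.9, 3.1], [10.5, 5.1, 3.0]]
controls = [[5.0, 5.0, 3.0], [5.5, 5.1, 3.1], [6.0, 4.8, 2.9], [5.2, 5.2, 3.0]]
selected = screen_peaks(cases, controls, threshold=4.0)

# Toy predictions from some classifier trained on the selected peaks.
metrics = evaluate([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1])
```

In a real analysis the cutoff would normally be derived from a p-value, ideally with a correction for testing many peaks at once, and the selected peaks would then be passed to the classifier.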
Keywords/Search Tags: serum peptidome profiling, bioinformatics, diagnosis, software