
New Data Pretreating Strategies Research For Mass Spectrum In Proteomics

Posted on: 2009-12-01  Degree: Master  Type: Thesis
Country: China  Candidate: J R Liu  Full Text: PDF
GTID: 2120360242978367  Subject: Analytical Chemistry
Abstract/Summary:
As the performers of life functions, proteins play a vital role in biochemical reactions, genetic control, metabolic regulation, and disease resistance. Proteomics, which aims to characterize all proteins in a cell or organism, has become one of the most important fields of life science. Biological mass spectrometry is crucial to protein profiling. However, physical and chemical noise, isotopic ion peaks, unexpected ions that may result from irregular dissociation, absent ions, and the shortcomings of database search algorithms greatly reduce the utilization rate of experimental data. To obtain greater sequence coverage and higher identification scores and accuracy, some pretreatment must be carried out before protein identification by database searching.

The aim of this work is to pretreat large-scale MSMS data. In the first part, theoretical data and a small portion of experimental data were used. The effects of ion distribution, LC elution time, mass spectrum intensity, and the completeness of the b/y ion series were studied. The proposed data pretreatment strategies include: data filtering by half-decimal place rules and LC elution time, modulating the intensity of the mass spectrum signal, and complementing the absent b/y signals. This part demonstrates the feasibility and validity of these strategies at a theoretical level only.

In the second part, the strategies are examined on a larger data scale. All models were set up for an ideal state, whereas experimental data contain many uncertain factors. In this part, massive experimental data were pretreated with the strategies described above. Because of the variability of experimental data, the strategies were combined and optimized, yielding refined approaches. The results showed that these approaches can greatly enhance the identification scores and accuracy of the proteins actually present and reduce false positive identifications. For a two-component system, the relative accuracy rate rose from 33.3% to 100%. For an eighteen-component system, the relative accuracy rate rose from 31.25% to 58.82%.
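
The abstract names the pretreatment strategies but does not give their formulas, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. It assumes that the half-decimal place rule can be read as a mass-defect check (the fractional part of a peptide's monoisotopic mass is roughly 0.000495 times the mass), and that an absent b/y signal can be complemented from the standard relation for singly charged fragments, m/z(b_i) + m/z(y_(n-i)) = M + 2 x proton mass. The function names, the excess factor, and the 0.2 Da tolerance are all hypothetical choices made for illustration.

```python
PROTON = 1.007276  # proton mass in Da

# Assumption: average mass excess per dalton for peptides (illustrative value).
PEPTIDE_EXCESS_PER_DA = 0.000495


def passes_half_decimal_rule(neutral_mass, tolerance=0.2):
    """Return True if the fractional mass looks peptide-like.

    Hypothetical filter: compares the observed fractional part with the
    fractional part expected from the assumed peptide mass excess.
    The tolerance (in Da) is an assumed, untuned value.
    """
    expected = (neutral_mass * PEPTIDE_EXCESS_PER_DA) % 1.0
    observed = neutral_mass % 1.0
    diff = abs(observed - expected)
    # Fractional parts wrap around at 1.0, so take the shorter distance.
    return min(diff, 1.0 - diff) <= tolerance


def complement_fragment_mz(precursor_mz, charge, fragment_mz):
    """Return the m/z of the singly charged complementary b/y ion.

    Uses b + y = M + 2 * proton, where M is the neutral peptide mass
    derived from the precursor m/z and charge.
    """
    neutral_mass = charge * (precursor_mz - PROTON)
    return neutral_mass + 2 * PROTON - fragment_mz


if __name__ == "__main__":
    # A mass with a peptide-like fractional part versus one without.
    print(passes_half_decimal_rule(1295.677))
    print(passes_half_decimal_rule(1295.20))
    # Dipeptide Ala-Gly: singly charged precursor and its b1 ion.
    print(round(complement_fragment_mz(147.07641, 1, 72.04439), 4))
```

In a real pipeline the tolerance would have to be tuned to instrument accuracy, and a complemented peak would need an intensity assigned to it; the abstract does not say how either was handled.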
Keywords/Search Tags:proteomics, MSMS, half-decimal place rules, LC elution time, precursor modification, strategies combining and optimizing