Research Of High Performance GC-MS Data Analyzing Algorithm

Posted on: 2012-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Li
GTID: 2248330338493136
Subject: Computer application technology
Abstract/Summary:
Gas chromatography-mass spectrometry (GC-MS) is robust, highly sensitive, and has a wide detection range. It is regarded as an important method for analyzing and detecting complex samples and is widely used in research fields such as food safety, medicine, and biochemistry. At present, the core detection components of domestically made mass spectrometers lag behind, and the accompanying software and data-processing algorithms are still in their infancy, even though the technology is part of China's long-range planning.

Among the raw-data processing steps of GC-MS (denoising, baseline correction, feature detection, resolution of overlapped peaks, retention-time alignment, data mining, etc.), resolving overlapped peaks and aligning retention times are the most challenging and time-consuming, and both call for accurate automated algorithms. Existing mass-spectrum analysis algorithms have become a bottleneck for applying the technology: they offer limited functionality, compute slowly, lack automation, and are restricted to single-sample analysis, so they can hardly meet the requirements of effective management and fast analysis and mining of large volumes of GC-MS data.

To address these problems, this thesis proposes a new, highly integrated, intelligent architecture for GC-MS data processing. The main contributions are:

(1) A description of GC-MS data and an analysis of existing algorithms. The author first analyzes the mathematical description and physical meaning of GC-MS data, the extracted ion chromatogram (XIC), and the total ion chromatogram (TIC), then introduces the raw-data processing flow and matching algorithms for GC-MS. The chapter closes with a summary of free software for GC-MS data processing.

(2) A new algorithm, DV-MCR, for resolving overlapped peaks in GC-MS. It was developed to address the difficulty, in MCR-ALS, of accurately determining the number of principal components and the initial matrix. Experiments under many simulated conditions show that it achieves good resolution results compared with MCR-ALS.

(3) Retention-time alignment by dynamic programming. The author designs a new algorithm for computing the similarity of characteristic peaks, uses dynamic programming to align retention times, and demonstrates the efficiency of the algorithm on experimental data.

(4) A general multi-threaded parallel cross-validation framework (Parallel-CV). It addresses the very large, time-consuming computation that cross-validation requires when it is used to estimate the performance of a machine learning model. The efficiency of the framework was verified in experiments with the SVM and PLS algorithms on datasets of different sizes.

(5) CloudChem, a cloud-computing-based software solution that provides chemometrics application services. It is built on the SaaS model and powered by parallel computing, and it addresses weaknesses of traditional chemometrics software. The application service platform supports high-speed, efficient computing and highly integrated storage, analysis, and mining of spectral, chromatographic, and mass spectrometry data. It also greatly reduces infrastructure and software costs for clients working in chemometrics.
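The idea behind contribution (4), evaluating cross-validation folds concurrently, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's Parallel-CV framework: the function names (`k_fold_indices`, `evaluate_fold`, `parallel_cv`) and the mean-predicting stand-in model are hypothetical, and the abstract's SVM and PLS models are not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def fit_mean_model(y_train):
    """Hypothetical stand-in model: always predicts the training mean."""
    return mean(y_train)


def evaluate_fold(y, fold):
    """Fit on the fold's training part; return mean squared error on its test part."""
    train, test = fold
    prediction = fit_mean_model([y[i] for i in train])
    return mean([(y[i] - prediction) ** 2 for i in test])


def parallel_cv(y, k=5, workers=4):
    """Evaluate all k folds concurrently; return (average error, per-fold errors)."""
    folds = k_fold_indices(len(y), k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input folds.
        errors = list(pool.map(lambda fold: evaluate_fold(y, fold), folds))
    return mean(errors), errors


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    score, per_fold = parallel_cv(targets, k=3, workers=3)
    print(score, per_fold)
```

Because the folds are independent, each one can be trained and scored without waiting for the others. In CPython, threads only speed up work that releases the global interpreter lock (for example, numerical routines in native libraries); for pure-Python models, `ProcessPoolExecutor` with a module-level worker function is the usual substitute.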
Keywords/Search Tags: Mass Spectrometry, Gas Chromatography, Multivariate Curve Resolution, Alignment, Parallel-CV, Cloud Computing, Parallel Computing, CloudChem