Application Of Chemometrics Technology Into Studies On Liver Diseases

Posted on: 2009-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: K Xu
Full Text: PDF
GTID: 2144360272490122
Subject: Analytical Chemistry
Abstract/Summary:
Biological mass spectrometry of liver tissue yields large amounts of data. In this work, chemometric techniques were applied to these data to extract significant results and information.

The liver is one of the most essential human organs. It detoxifies the body, breaks down bacteria and alcohol, and is responsible for metabolism and digestion. However, alcoholism, an unhealthy diet, and contaminated water can induce liver diseases that seriously damage human health. The diet and living habits of Chinese people contribute to high rates of liver disease in China. These diseases, including hepatitis A, hepatitis B, hepatocellular carcinoma, and hepatocirrhosis, kill thousands of Chinese people every year, so early prevention, detection, and clinical treatment of liver diseases are very important.

In this work, a widely used chemometric method was applied to mass spectrometry data of serum proteins from patients with different kinds of liver disease. Classification models of these diseases were established to distinguish samples from different sources and to select a number of significant biomarkers that may be closely related to the formation, metastasis, and treatment of liver diseases. An improved Partial Least Squares (PLS) variable selection method was used for the practical study of liver diseases, and a new program implementing the approach was written in VBA. The aim of the research was to develop a chemometric method that can analyze biological data effectively. The results could provide a reliable foundation for research on liver diseases, and they extend the application of chemometrics to the analysis of biological data.

Chapter 1 gives a general introduction to the history and research fields of chemometrics, conventional pattern recognition, and variable selection.
Based on a summary of the harm caused by liver diseases, their development, and the methods used to study them, the purpose, significance, and main contents of the thesis are summarized.

Chapter 2 presents the main principles of PLS regression and the PLS variable selection method, including the derivation of the method. Information from PLS modeling, such as the regression coefficients, is used to select among the original regression variables, to eliminate unimportant or uninformative variables, and to obtain simpler models without loss of predictive power.

Chapters 3 to 5 introduce applications of PLS variable selection to liver diseases:

(1) The PLS method was applied to SELDI-TOF-MS data obtained from serum proteins of HBV (hepatitis B virus) patients and healthy people. A classification model was built for serum proteins from the different samples. The cross-validation correlation coefficient (CR) of the model was 0.9745, so the method could provide reliable results for correctly distinguishing sample sources. The classification plots in the thesis were drawn to be clear and intuitive. Furthermore, the important factors or variables that discriminate HBV patients from healthy people were found by analyzing the model. These variables were peak intensities of proteins in several m/z regions, which reflect the up-regulation or down-regulation of proteins in those regions. As potential biomarkers, these proteins may be closely related to the development of hepatitis B and merit further study.

(2) The improved PLS variable selection method was applied to build a prediction model for classifying HCC (hepatocellular carcinoma) patients and healthy people. The CR value of the model exceeded 0.96, and every sample was classified correctly. The model selected 30 variables related to HCC. The proteins in these m/z regions are up-regulated or down-regulated during the formation of HCC.
In addition, the classification plots constructed from the fitted values of the model were clear and intuitive and showed the discriminating power of the model well.

(3) The SELDI-TOF-MS data of serum proteins from HCC patients with and without portal vein tumor thrombi (PVTT) all came from the Liver Cancer Institute, Fudan University. Using two PLS variable selection methods, classification models were established with CR values of 0.9553 and 0.9404. The 'CR~n' plots showed that both methods were suitable for analyzing the data. The variables selected by the models were related to the formation of PVTT. Compared with a decision tree classification algorithm, the PLS methods had stronger predictive ability and gave more significant potential biomarkers, making them applicable to the analysis of SELDI-TOF-MS data.

Conclusions and future prospects for this research are summarized in the last chapter. The PLS variable selection method was improved and extended to research on liver diseases. Several classification models were established to distinguish samples from different sources, and some significant potential biomarkers were selected. The results of all the practical applications indicate that the PLS variable selection method is well suited to problems involving the large amounts of data produced by biological mass spectrometry.
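The general idea of coefficient-based PLS variable selection with a cross-validated CR can be illustrated with a minimal Python sketch. This is not the thesis's VBA program or its improved method, whose details are not given here. It assumes a standard single-response PLS (PLS1, NIPALS deflation), ranks variables by the absolute value of their regression coefficients, and takes CR to be the Pearson correlation between leave-one-out predictions and the class labels (+1 for patient, -1 for healthy). The function names (`fit_pls1`, `loo_cr`, `select_top_k`) and the synthetic data are hypothetical and stand in for real SELDI-TOF-MS peak intensities.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pls1(X, y, n_comp):
    """PLS1 by NIPALS deflation. Returns (coefs, predict)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_comp):
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]  # X'y
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break  # no covariance left between X and y
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # scores
        tt = dot(t, t)
        if tt < 1e-12:
            break
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]  # X loadings
        q = dot(f, t) / tt  # y loading
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))

    def response(x):
        # Linear map from a centered row to the centered prediction.
        x, out = list(x), 0.0
        for w, p, q in comps:
            s = dot(x, w)
            out += q * s
            x = [xi - s * pi for xi, pi in zip(x, p)]
        return out

    # response() is linear, so evaluating it on unit vectors yields the coefficients.
    coefs = [response([1.0 if k == j else 0.0 for k in range(m)]) for j in range(m)]

    def predict(row):
        return y_mean + dot([a - b for a, b in zip(row, x_mean)], coefs)

    return coefs, predict

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da, db = [v - ma for v in a], [v - mb for v in b]
    return dot(da, db) / math.sqrt(dot(da, da) * dot(db, db))

def loo_cr(X, y, n_comp):
    """Leave-one-out CR (assumed definition: correlation of held-out predictions with y)."""
    preds = []
    for i in range(len(X)):
        _, predict = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        preds.append(predict(X[i]))
    return pearson(preds, y)

def select_top_k(coefs, k):
    """Indices of the k variables with the largest absolute coefficients."""
    ranked = sorted(range(len(coefs)), key=lambda j: abs(coefs[j]), reverse=True)
    return sorted(ranked[:k])

# Synthetic stand-in data: variables 0 and 1 shift with class, the other 10 are noise.
random.seed(0)
y = [1.0] * 20 + [-1.0] * 20
X = [[2 * c + random.gauss(0, 0.5), -2 * c + random.gauss(0, 0.5)]
     + [random.gauss(0, 1) for _ in range(10)] for c in y]

coefs, _ = fit_pls1(X, y, n_comp=2)
selected = select_top_k(coefs, k=2)
X_sel = [[row[j] for j in selected] for row in X]
cr_full = loo_cr(X, y, n_comp=2)
cr_selected = loo_cr(X_sel, y, n_comp=2)
print("selected variables:", selected)
print("CR (all variables): %.4f" % cr_full)
print("CR (selected only): %.4f" % cr_selected)
```

Ranking by raw coefficient magnitude is only meaningful when the variables are on comparable scales, so real peak intensities would normally be scaled first. Repeating `loo_cr` for different numbers of retained variables n would give a curve comparable in spirit to the 'CR~n' plots mentioned above, under the assumed definition of CR.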
Keywords/Search Tags: Liver diseases, SELDI-TOF-MS, PLS variable selection, Classification model, Huge amount of data modeling