
Research On Identification Method And Algorithms Of Core Fucosylation Based On Mass Spectrometry Data

Posted on: 2022-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Su
Full Text: PDF
GTID: 2480306602967069
Subject: Master of Engineering
Abstract/Summary:
Glycosylation is one of the most complex and common post-translational modifications of proteins. It is mainly divided into two categories: N-linked glycosylation and O-linked glycosylation. Core fucosylation is an α1,6-fucose modification that occurs only on N-linked glycans. Current studies have found that only fucosyltransferase 8 (FUT8) can catalyze core fucosylation. Core fucosylation plays a role in the immune response and in the occurrence, development, recurrence, invasion and metastasis of cancer. In addition, core-fucosylated proteins can serve as important tumor markers for the early detection of cancer and the analysis of cancer type. There is therefore an urgent need for methods that can identify core fucosylation automatically. However, because of fucose migration, existing methods cannot distinguish core fucosylation from terminal fucose on N-linked glycans that has undergone migration, which results in a lack of core fucosylation mass spectrum sample data.

In this context, this study treats the identification of core fucosylation as a one-class classification problem. Core fucosylation identification models are built with the Mapping Convergence algorithm and with an autoencoder, respectively, using mass spectrum data that contain only non-core fucose, obtained from the brain tissue of FUT8-knockout mice. The innovative results of this study are as follows:

1. Establishing the characteristic ions required for core fucosylation identification. By analyzing the inherent structure of N-linked glycans and the way fucose is attached in core fucosylation, the Y ions common to the mass spectrometry identification of core and non-core fucosylation are determined, and the characteristic ions required for core fucosylation identification are established. Statistical analysis and correlation analysis are then performed on the relative intensities of the characteristic ions in the mass spectra of core and non-core fucosylation to reveal the differences between the two kinds of mass spectrum data.

2. Based on the differences between core fucosylation and non-core fucosylation, a core fucosylation identification method based on the Mapping Convergence algorithm is proposed. First, a support vector data description model is trained on non-core fucosylation mass spectrum sample data only, so that it learns the characteristics of non-core fucosylation mass spectrum data. Second, a support vector machine is trained iteratively to separate the core fucosylation and non-core fucosylation mass spectrum sample data in the feature space. The support vector machine obtained at the end is then used for core fucosylation identification. Finally, the experimental results are analyzed and the advantages and disadvantages of the identification model based on the Mapping Convergence algorithm are discussed.

3. To address the shortcomings of the Mapping Convergence algorithm on the core fucosylation identification problem, a core fucosylation identification method based on an autoencoder is proposed. First, core fucosylation identification is treated as an anomaly detection problem. An autoencoder is then trained to learn the characteristics of non-core fucosylation, and a threshold is determined from the reconstruction error on the training set. The trained autoencoder and the threshold are then used for core fucosylation identification. Finally, the advantages and disadvantages of the autoencoder-based identification model are analyzed and discussed.

4. The two proposed methods are verified on existing identified mouse brain mass spectrum data, and comparative experiments are conducted to analyze how missing feature values in the training set affect each method.

In summary, in view of the influence of fucose migration on the identification of core fucosylation, this study analyzes the characteristics of core fucosylation and non-core fucosylation mass spectrum data and proposes corresponding methods to improve the accuracy of core fucosylation identification. The models constructed with the proposed methods can accurately identify core fucosylation, which provides a theoretical basis and a method for the diagnosis and treatment of related diseases.
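
The threshold step of the autoencoder-based method can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not say how the threshold is derived from the training-set reconstruction errors, so the empirical quantile used here is an assumption. The trained autoencoder is replaced by a hypothetical placeholder function, and the feature vectors (relative intensities of four characteristic ions) are invented for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]

# Hypothetical stand-in for a trained autoencoder: it always "reconstructs"
# a typical non-core intensity profile. A real model would be learned from
# non-core fucosylation spectra; these numbers are invented.
NON_CORE_PROFILE = [0.6, 0.3, 0.1, 0.0]


def placeholder_reconstruct(x: Vector) -> List[float]:
    return list(NON_CORE_PROFILE)


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between a feature vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(train_errors: Sequence[float], quantile: float = 0.95) -> float:
    """Nearest-rank empirical quantile of the training-set errors.

    The quantile rule is an assumption of this sketch, not taken from the thesis.
    """
    if not train_errors:
        raise ValueError("train_errors must not be empty")
    ordered = sorted(train_errors)
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1]


def flag_core(
    spectra: Sequence[Vector],
    reconstruct: Callable[[Vector], Vector],
    threshold: float,
) -> List[bool]:
    """Flag a spectrum as core fucosylation (anomalous) when its error exceeds the threshold."""
    return [reconstruction_error(x, reconstruct(x)) > threshold for x in spectra]


# Training set: non-core spectra only (invented values).
train_non_core = [
    [0.60, 0.30, 0.10, 0.00],
    [0.58, 0.32, 0.10, 0.00],
    [0.62, 0.28, 0.10, 0.00],
    [0.60, 0.30, 0.08, 0.02],
]
train_errors = [
    reconstruction_error(x, placeholder_reconstruct(x)) for x in train_non_core
]
threshold = fit_threshold(train_errors)

# Two query spectra: one far from the non-core profile, one matching it.
queries = [[0.2, 0.2, 0.1, 0.5], [0.6, 0.3, 0.1, 0.0]]
print(flag_core(queries, placeholder_reconstruct, threshold))
```

In practice `placeholder_reconstruct` would be replaced by the forward pass of an autoencoder trained on non-core spectra, and the quantile would be tuned on held-out data.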
Keywords/Search Tags: Mass spectrum data, Fucose migration, Core fucosylation, Mapping Convergence algorithm, Autoencoder