Analysis And Identification Of High Resolution Mass Spectrometry Data Of Natural Products Based On Deep Neural Network

Posted on: 2021-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Zhao
GTID: 2404330647455488
Subject: Pharmacy
Abstract/Summary:
To accelerate the modernization and internationalization of traditional Chinese medicine (TCM) and to promote multi-disciplinary and cross-industry exchange, the integration of science and technology with TCM must be broadened and deepened. In particular, cutting-edge technologies such as systems biology, big data, and artificial intelligence need to be integrated more closely with TCM. The "Modernization of TCM" strategy has been in place for more than 25 years, and the TCM industry in China has entered a new era of rapid development. Basic research on the effective substances, mechanisms of activity, quality control, pharmacokinetics, and safety of TCM has been carried out comprehensively. Among these topics, the material basis has always been the first fundamental problem to be solved for TCM. Because the chemical components of medicinal plants are complex and varied, mass spectrometry has become the most efficient research method. Modern mass spectrometers offer strong structure-identification capability, high sensitivity, a wide scan range, fast analysis speed, and high compatibility with chromatography. However, because there are many manufacturers and many instrument models, the universality of laboratory databases has become a difficult problem for researchers. In recent years, with the development of artificial intelligence, the use of deep neural networks to study and analyze large amounts of data has become a key technology in machine learning. Deep neural networks, built from artificial neurons, form deep learning models for different tasks that learn the underlying regularities and representations of large datasets, and they can therefore reach or even exceed human-level performance on some tasks. Machine learning algorithms based on deep neural networks can thus provide a new solution for automatic data recognition and universal database matching.

Inspired by this, this thesis uses a deep neural network to learn the intrinsic differences between the mass spectral data of flavonoids and benzophenones, and establishes a neural network model to distinguish these two classes of natural products. Flavonoids and benzophenones have similar structures, molecular weights, and MS cleavage pathways, so searching a database with high-resolution MS data often gives ambiguous results. Flavonoids and benzophenones are therefore studied here to explore how deep neural networks can be used for automatic compound classification.

UHPLC-Q-Orbitrap MS was used to analyze 50 randomly selected standards (25 flavonoids and 25 benzophenones), and the liquid chromatography and mass spectrometry parameters were optimized with response surface methodology. The optimal conditions were as follows: separation on a Waters ACQUITY UPLC HSS T3 column (2.1 × 100 mm, 1.8 μm) with gradient elution using acetonitrile and 0.1% formic acid at a flow rate of 0.2 mL·min⁻¹; a column temperature of 30 °C; a capillary temperature of 200 °C; a helium heating temperature of 400 °C; and spray voltages of 3.2 kV and 2.8 kV in positive and negative ion mode, respectively. Under these conditions, 133 standards (84 flavonoids and 49 benzophenones) were analyzed by LC-MS. Their LC-MS data were extracted with Xcalibur 4.0 software as 44 dimensions per compound: the retention time, the parent ion m/z, and 20 secondary mass spectral (MS2) fragments in each of the positive and negative ion modes. The MS2 fragments selected in each mode were the 20 most intense. For deep learning modeling, the retention times, parent ions, and MS2 fragments obtained for each compound in positive/negative switching scan mode were concatenated to form the input features of the model.

The data of 113 standards (74 flavonoids and 39 benzophenones) were used to train and validate a deep feedforward neural network so that the model could learn to distinguish the two classes of compounds. The model was then tested on the remaining 20 standards (10 flavonoids and 10 benzophenones). To find better input features, three feature sets were compared on the data of the 113 standards: 32 dimensions (precursor ions in positive and negative modes and 15 MS2 fragments per mode), 42 dimensions (precursor ions in positive and negative modes and 20 MS2 fragments per mode), and 44 dimensions (retention times, precursor ions, and 20 MS2 fragments per mode in positive and negative modes). A deep feedforward neural network model was obtained for each input dimensionality. The 42-dimensional input features gave the highest classification accuracy, with a mean accuracy of 80% on the 20 test standards.

To further verify the model, Compound Discoverer 2.1 software was used to automatically extract the data of two mango leaf samples prepared with different solvents, yielding 42-dimensional high-resolution mass spectrometry data for 102 compounds. The model classified 46 of them as flavonoids and 26 as benzophenones. Of these, 18 flavonoids and 12 benzophenones were identified with the mzCloud database and by comparison with standards, including 10 flavonoids found in mango leaves for the first time. The remaining 28 flavonoids and 14 benzophenones could not be identified, which indicates that they might be new compounds. These results suggest that the method is sensitive, accurate, and automated, and is suitable for the rapid discovery of new compounds in TCM.

In conclusion, this thesis uses response surface methodology to optimize the LC-MS conditions and a UHPLC-Q-Orbitrap MS instrument to acquire data for 133 standards. A rapid method for distinguishing flavonoids from benzophenones was then established with a deep neural network, with a mean accuracy of 80%. These results support the basic feasibility of the approach. In addition, the accuracy of the model can be improved by increasing the number of standards, which suggests that deep learning is suitable for rapidly distinguishing compounds in complex TCM samples. Finally, the compounds in different extracts of mango leaves were identified, and the results indicate that the technique has considerable practical value for analyzing and distinguishing different classes of compounds and can help guide phytochemistry research.
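
The three input layouts compared in the abstract (32, 42, and 44 dimensions) can be illustrated with a minimal Python sketch. This is not the thesis code: the function names, the dictionary keys, the ordering of the features, the use of zero-padding when a spectrum has fewer peaks than requested, and the example values are all assumptions made for illustration. Only the counts (precursor ion plus 15 or 20 of the most intense MS2 fragments per ion mode, optionally with retention time) come from the abstract.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (fragment m/z, intensity)


def top_fragments(peaks: List[Peak], n: int) -> List[float]:
    """Return the m/z values of the n most intense peaks.

    Zero-padding short spectra to length n is an assumption of this sketch.
    """
    ranked = sorted(peaks, key=lambda p: p[1], reverse=True)[:n]
    mzs = [mz for mz, _ in ranked]
    return mzs + [0.0] * (n - len(mzs))


def build_features(pos: Dict, neg: Dict, n_fragments: int = 20,
                   include_rt: bool = False) -> List[float]:
    """Concatenate positive- and negative-mode data into one input vector.

    Per mode: [retention time (optional), precursor m/z, n fragment m/z values].
    The feature order is hypothetical; the abstract does not specify it.
    """
    features: List[float] = []
    for scan in (pos, neg):
        if include_rt:
            features.append(scan["rt"])
        features.append(scan["precursor_mz"])
        features.extend(top_fragments(scan["peaks"], n_fragments))
    return features


# Made-up values for illustration only; not measured data.
pos = {"rt": 6.4, "precursor_mz": 423.1,
       "peaks": [(303.1, 8.0e5), (273.0, 2.1e6), (153.0, 4.0e5)]}
neg = {"rt": 6.4, "precursor_mz": 421.1,
       "peaks": [(331.0, 1.5e6), (301.0, 3.0e6)]}

print(len(build_features(pos, neg, n_fragments=15)))                   # 32
print(len(build_features(pos, neg, n_fragments=20)))                   # 42
print(len(build_features(pos, neg, n_fragments=20, include_rt=True)))  # 44
```

Each resulting vector would serve as one input row for a feedforward classifier with two output classes (flavonoid or benzophenone).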
Keywords/Search Tags: deep neural network, UHPLC-Q-Orbitrap MS, flavonoids, benzophenones