
Development And Visualization Of New Methods For Qualitative And Quantitative Analysis Of Complex Systems Based On Chemical Spectra

Posted on: 2024-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: X K Li
Full Text: PDF
GTID: 2531307166473374
Subject: Materials and Chemical Engineering (Professional Degree)
Abstract/Summary:
Chemical spectra are widely used in fields such as food and drug analysis. However, in the analysis of complex systems, spectral overlap can occur, and traditional analytical methods may not give sufficiently accurate results. In addition, most analysts find it difficult to build models by writing code. To address these issues, this study improved traditional chemometric methods to enhance their accuracy and introduced advanced artificial intelligence algorithms into spectral analysis. Finally, a cross-software platform analysis system was developed. Following the introduction in Chapter 1, the research is presented in five chapters (Chapters 2 to 6).

Chapter 2: Application and visualization of a multiple linear regression model in the quantitative analysis of EEM spectra. This chapter proposes an effective method for the quantitative determination of target compounds from excitation-emission matrix (EEM) spectra with partial overlap. Mathematical models were established by combining stepwise regression with multiple linear regression, using features extracted from the EEM spectra. The accuracy and reliability of the established models were validated, demonstrating that the method enables accurate quantitative analysis. To simplify the calculations, a user-friendly graphical user interface (GUI) was developed. The GUI covers data input, model establishment, model optimization, and presentation of results.

Chapter 3: A hybrid variable selection and modeling strategy for the determination of target compounds in different spectral datasets. This chapter proposes a hybrid method for establishing quantitative models of target compounds in various spectral datasets. The method uses interval partial least squares (iPLS) to select variables and a gradient descent (GD) algorithm to establish the model. Accuracy is optimized by adjusting the number of iPLS intervals and the learning rate and number of iterations of GD. The performance of the iPLS-GD method was evaluated on NIR, ¹H NMR, and EEM datasets. Compared with iPLS alone, the iPLS-GD method showed better prediction ability. By combining the strengths of iPLS and GD, more accurate results can be obtained. Thus, the iPLS-GD method is a superior alternative for modeling complex spectral data.

Chapter 4: Discrimination of storage years of Chenpi using GC-MS and MIR coupled with the extreme gradient boosting method, and release of a graphical user interface. This study is the first to investigate the XGBoost algorithm coupled with MIR spectra and GC-MS chromatograms for discriminating Chenpi of different storage years (2, 4, 6, and 8 years). First, important features were extracted separately from the first-order derivative MIR spectral data and the GC-MS chromatogram data, and XGBoost discrimination models were then established on the extracted features. The models achieved 100% prediction accuracy and 100% classification accuracy for the analyzed samples, confirming the potential of XGBoost for discriminating Chenpi samples of different storage years. Moreover, it was found for the first time that, for Chenpi samples obtained from the same company, the relative content of β-terpineol increased with storage time. Finally, an application with a GUI was designed to discriminate the storage years of Chenpi samples from first-order derivative MIR spectral data and GC-MS chromatogram data using the XGBoost models. This study provides new insight into the convenient and accurate discrimination of Chenpi samples of different storage years.

Chapter 5: Using machine learning approaches and modified mid-level data fusion of GC and MIR for Chenpi geographical origin discrimination. The purpose of this study is to provide an effective combination strategy for the geographical discrimination of Chenpi samples. To this end, 39 Chenpi samples collected from 8 different regions of Xinhui District (Guangdong, China) were analyzed by GC and MIR spectroscopy. First, four machine learning methods, AdaBoost, naive Bayes (NB), k-nearest neighbors (KNN), and artificial neural network (ANN), were used to establish discrimination models on the GC and MIR data individually. Then, data fusion strategies, including mid-level data fusion and a newly developed modified mid-level data fusion, were applied to combine the GC and MIR data. The discrimination performance of each model was illustrated with a confusion matrix. The results showed that data fusion clearly improved Chenpi discrimination compared with GC or MIR data alone, and that the KNN and ANN models based on modified mid-level data fusion performed best, with only one sample misclassified. Combining advanced machine learning methods with the newly proposed modified mid-level data fusion strategy offers a new approach to classifying Chenpi samples from different geographical origins.

Chapter 6: Establishment and application of a cross-software platform data analysis system. This study developed an open-source, cross-software platform data analysis system. The system was built with Java Web technology and can call algorithm models from other platforms. It consists of five modules: login and registration, user management, data management, data classification management, and data modeling. Users can upload data to the platform and build models through graphical operations. To verify the system's feasibility, data were uploaded to the system for modeling and analysis, and accurate results were obtained. The system makes data analysis and processing easier for analysts.
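
The general idea of mid-level data fusion mentioned for Chapter 5 (extract features from each data block, concatenate them sample by sample, then classify) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the Fisher-score feature selection, the synthetic "GC" and "MIR" blocks, the function names, and the leave-one-out KNN evaluation. The abstract does not state how the thesis extracts features or how its modified mid-level fusion differs, so neither is reproduced here.

```python
import math
import random


def fisher_scores(X, y):
    """Score each column by between-class over within-class sum of squares."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        grand_mean = sum(col) / len(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(vals) / len(vals)
            between += len(vals) * (mean_c - grand_mean) ** 2
            within += sum((v - mean_c) ** 2 for v in vals)
        scores.append(between / (within + 1e-12))
    return scores


def extract_features(X, y, k):
    """Keep the k highest-scoring columns of one block, then autoscale them."""
    scores = fisher_scores(X, y)
    keep = sorted(sorted(range(len(scores)), key=lambda j: -scores[j])[:k])
    cols = []
    for j in keep:
        col = [row[j] for row in X]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1)) or 1.0
        cols.append([(v - mean) / std for v in col])
    return keep, [list(row) for row in zip(*cols)]


def mid_level_fusion(blocks, y, k):
    """Concatenate the extracted features of every block, sample by sample."""
    extracted = [extract_features(X, y, k)[1] for X in blocks]
    return [sum(rows, []) for rows in zip(*extracted)]


def knn_loo_accuracy(X, y, n_neighbors=3):
    """Leave-one-out accuracy of a majority-vote k-nearest-neighbour classifier."""
    correct = 0
    for i, xi in enumerate(X):
        dists = sorted(
            (math.dist(xi, xj), y[j]) for j, xj in enumerate(X) if j != i
        )
        votes = [label for _, label in dists[:n_neighbors]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def make_block(rng, labels, n_vars, informative):
    """Synthetic block (not real data): noise plus a class shift on a few columns."""
    return [
        [rng.uniform(-0.5, 0.5) + (10.0 * label if j in informative else 0.0)
         for j in range(n_vars)]
        for label in labels
    ]


rng = random.Random(0)
labels = [0] * 6 + [1] * 6  # two hypothetical classes, six samples each
gc_block = make_block(rng, labels, n_vars=8, informative={1, 4})
mir_block = make_block(rng, labels, n_vars=10, informative={2, 7})

fused = mid_level_fusion([gc_block, mir_block], labels, k=2)
accuracy = knn_loo_accuracy(fused, labels)
print(len(fused), len(fused[0]), accuracy)
```

In this sketch the features are selected using the labels of all samples before the leave-one-out loop, so the reported accuracy is optimistic. A stricter evaluation would repeat the feature extraction inside each cross-validation fold.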
Keywords/Search Tags: Chemometrics, Spectral analysis, Machine learning, Data fusion, Visual interface