Study On Quantitative Regression Method For Organic Infrared Spectroscopy Based On Integrated Learning

Posted on: 2021-01-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W W Jiang
GTID: 1361330614959935
Subject: Signal and Information Processing
Abstract/Summary:
Fourier transform infrared (FTIR) spectroscopy has been widely used for quantitative analysis in agriculture, industry, food, environmental, pharmaceutical, and other fields. Quantitative analysis is one of the core problems in infrared spectrum analysis: a model is established from the measured spectra and their corresponding physical and chemical properties, and this model is then used to estimate the corresponding properties of unknown spectra. When FTIR spectroscopy is combined with chemometrics for quantitative analysis, abnormal samples greatly reduce the stability and prediction accuracy of the model. Noisy, uninformative, or even interfering wavelengths in the full spectrum increase the complexity of the model and may also degrade its prediction performance. In addition, the recent development of deep learning algorithms offers a new approach to building quantitative infrared spectrum models. This dissertation studies these issues in depth. The main work and results are summarized as follows:

1) An improved Monte Carlo cross validation (MCCV) method is proposed to identify abnormal samples. When conventional MCCV identifies abnormal samples with the mean-variance diagram method, all samples are selected into the modeling subset with equal probability, and the threshold is set empirically. By changing the range of samples from which the Monte Carlo random sampling draws, so that only normal samples are used as modeling subsets, the recognition rate of abnormal samples is improved. In addition, the suspicious samples flagged by the improved MCCV method are screened a second time to reduce the rate at which normal samples are misjudged. The experimental results verify the effectiveness of the improved MCCV method.

2) A new wavelength selection algorithm, MCUVE-SPA-MW, which adds a moving window to MCUVE-SPA, is proposed. Because cascading Monte Carlo Uninformative Variable Elimination (MC-UVE) with the Successive Projections Algorithm (SPA) may yield isolated wavelength points, the algorithm is improved to keep the effective wavelength points continuous: a moving window is placed with the optimal wavelength as its starting point or center, which improves the accuracy of the resulting prediction model. The experimental results verify the effectiveness of the algorithm.

3) A method combining the correlation coefficient with synergy interval partial least squares (CC-SiPLS) is proposed. The SiPLS algorithm does not account for variables within an interval that are irrelevant to the component information. To address this shortcoming, the wavelength variables that are highly correlated with the target component information are selected first, and SiPLS is then applied to the selected variables to further simplify the prediction model. The experimental results verify the effectiveness of the algorithm.

4) Because spectral preprocessing influences the results of wavelength selection algorithms, the effect of five preprocessing methods on the distribution of the selected wavelengths and on the predictions of the resulting models is studied. The results show that both the preprocessing method and the wavelength selection algorithm affect the distribution of the selected wavelengths and the modeling performance.

5) A quantitative regression model based on the Blending ensemble learning algorithm is proposed. In view of the fact that deep learning is rarely used in infrared spectrum analysis, a Gradient Boosting Decision Tree (GBDT), a linear-kernel support vector machine, and a Gaussian-kernel support vector machine are used as the base learners of the fusion. The features of GBDT and the support vector machines are fused, and the results are compared with those of the GBDT model and the single-kernel support vector regression models. The experimental results show that the Blending ensemble model has strong applicability, high prediction accuracy, and good generalization ability.
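
As a rough illustration of the idea in item 1), the sketch below runs a conventional MCCV pass, flags suspicious samples from their mean prediction residuals, and then repeats the sampling with the modeling subsets drawn only from the unflagged samples. It is not the dissertation's implementation. A univariate least-squares line stands in for the calibration model, and the function names `fit_line`, `mccv_residual_stats`, and `flag_suspects`, the `pool` parameter, the synthetic data, and the `k`-times-median threshold are all assumptions made for this example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for a calibration model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mccv_residual_stats(xs, ys, pool=None, n_runs=300, train_frac=0.7, seed=0):
    """Return {sample index: (mean, std)} of held-out prediction residuals.

    pool lists the indices allowed into the modeling subset. None means every
    sample may be drawn, which corresponds to conventional MCCV.
    """
    rng = random.Random(seed)
    n = len(xs)
    pool = list(range(n)) if pool is None else list(pool)
    n_train = max(2, int(train_frac * len(pool)))
    residuals = {i: [] for i in range(n)}
    for _ in range(n_runs):
        train = set(rng.sample(pool, n_train))
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Every sample outside the modeling subset is predicted in this run.
        for i in range(n):
            if i not in train:
                residuals[i].append(ys[i] - (a * xs[i] + b))
    return {i: (statistics.fmean(r), statistics.pstdev(r))
            for i, r in residuals.items() if r}


def flag_suspects(stats, k=4.0):
    """Hypothetical rule: flag samples whose |mean residual| > k * median."""
    abs_means = {i: abs(mean) for i, (mean, _) in stats.items()}
    threshold = k * statistics.median(abs_means.values())
    return sorted(i for i, v in abs_means.items() if v > threshold)


# Synthetic data (assumption): a noisy line with one injected abnormal sample.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.02 * ((i * 7) % 5 - 2) for i, x in enumerate(xs)]
ys[7] += 8.0

# Pass 1: conventional MCCV, all samples equally likely to enter the model.
first = mccv_residual_stats(xs, ys)
suspects = flag_suspects(first)

# Pass 2: modeling subsets are drawn only from samples not flagged in pass 1.
clean_pool = [i for i in range(len(xs)) if i not in suspects]
second = mccv_residual_stats(xs, ys, pool=clean_pool)

print("suspects:", suspects)
for i in suspects:
    print(i, "pass 1 (mean, std):", first[i], "pass 2 (mean, std):", second[i])
```

In the first pass the abnormal sample sometimes enters the modeling subset and biases the residuals of the normal samples. In the second pass it never does, so the normal samples' residuals shrink toward the noise level while the abnormal sample stays clearly separated. Re-checking the flagged samples against the second-pass statistics is one possible way to read the second screening step described in the text.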
Keywords/Search Tags: Fourier Transform Infrared Spectra, Spectral Pretreatment, Abnormal Sample Identification, Wavelength Selection, Ensemble Learning, Quantitative Regression