
Modelling Of Near-infrared Spectroscopy Based On Semi-supervised Learning And Transfer Learning

Posted on: 2013-04-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y He
Full Text: PDF
GTID: 1220330377452877
Subject: Computer application technology
Abstract/Summary:
Science and the economy are developing rapidly, and production processes are becoming increasingly automated and intelligent. Traditional means of product quality control can no longer meet the needs of product development and control. Near-infrared (NIR) spectroscopy has emerged as a fast and efficient detection technology. It can greatly improve the efficiency of product quality supervision and management, and it is now widely used in the petroleum, pharmaceutical, tobacco, and other industries.

Results from an earlier project, "Intelligent Methods of Sensory Evaluation", show that it is difficult to build a sensory evaluation model with good classification performance when the model input is chemical composition data carrying incomplete information. Traditional chemical analysis can usually detect only a limited number of components, whereas near-infrared spectra contain a wealth of component information, and NIR spectroscopy is routinely used to detect the chemical components of products. In this dissertation, several machine learning methods are applied to near-infrared spectroscopy. Features of cigarette quality and the components of cigarette blends are mined from the near-infrared spectra, and the high-dimensional spectral data are used directly to build models relating product quality to spectra.

Practice shows that when NIR is applied to predicting complex components against strong background noise, traditional NIR analysis techniques require a large number of samples, and the resulting models tend to have poor performance and stability and are difficult to transfer.
The existing NIR modeling technology therefore needs to be improved.

This dissertation begins by reviewing the basic principles of NIR modeling and the state of research in China and abroad. Inspired by transductive inference, semi-supervised learning and transfer learning are then introduced into the NIR modeling methodology. The dissertation addresses four key technologies: high-dimensional NIR data processing, qualitative spectral analysis, quantitative spectral analysis, and model transfer. The main contents are as follows:

1) When the relationship between the observed data and the near-infrared spectra is nonlinear, traditional dimension reduction methods tend to lose characteristic information, destroy the manifold structure, and degrade classifier performance. This dissertation proposes a novel algorithm based on semi-supervised kernel neighborhood preserving embedding, named SSKNPE. A kernel distance is used to transform the nonlinear problem into a linear problem in a new feature space. The algorithm uses the prior class information of the labeled data to constrain the feature mapping, so that the data are mapped from the high-dimensional space to a low-dimensional space while preserving both global and local structure. Experimental results show that SSKNPE effectively improves classification ability after dimension reduction. SSKNPE is applied to cigarette brand identification based on near-infrared spectra.

2) Because they rely on inductive inference, traditional classifiers suffer from high prediction risk and need many training samples. This dissertation introduces transductive inference and semi-supervised learning and proposes a novel semi-supervised support vector machine based on affinity propagation clustering (APS4VM). The low-density region is found among many large-margin candidates by combining affinity propagation clustering with chaos optimization.
The method finds a support vector decision surface that classifies samples safely. Although the Iris data set and the cigarette taste evaluation data set contain few labeled samples, APS4VM achieves good classification performance on both. The semi-supervised support vector machine therefore has practical value in engineering applications and is suitable for building a qualitative model of cigarette taste evaluation.

3) Quantitative analysis models based on traditional regression methods perform poorly on complex nonlinear problems, particularly when training samples are scarce. This dissertation proposes a novel semi-supervised support vector regression algorithm based on quantum-behaved particle swarm optimization (QPSO-LSS3VR). The labels of unlabeled samples are estimated by combining the K-nearest-neighbor method with confidence-based selection. The best parameters (γ, λ, σ) of the semi-supervised support vector regression model are found by the QPSO algorithm. An experiment on predicting the total sugar content of cigarettes shows that, when few training samples are available, the algorithm effectively reduces the standard error of prediction and the cost of modeling, and that it can be used to build a sugar content prediction model for cigarettes.

4) Models generalize poorly between near-infrared spectrometers. Existing model transfer methods are analyzed and found to be unsuitable: they impose strict requirements on the preparation of standard samples, the practical procedure is complex, and models transferred with traditional statistical methods have low prediction performance. This dissertation proposes a new NIR model transfer algorithm that combines transfer learning with a sample similarity distance metric, named SM-TrBoostEns. The NIR spectra are projected into a low-dimensional space by a nonlinear dimension reduction method.
Based on the measured similarity between samples, knowledge is transferred by selecting useful samples for modeling on the target instrument, and the model is transferred by combining transfer boosting with ensemble learning. Experiments on predicting cigarette total sugar by transferring a model between two NIR instruments show that the algorithm still effectively improves regression precision when only a few standard-sample spectra are collected on the target instrument, so the algorithm has practical value. The experiments also show that transfer learning merits further exploration and improvement for NIR model transfer.

5) Finally, the conclusions and innovations of the research are summarized. Future work will address related topics such as the output confidence of semi-supervised learning models, discrimination of abnormal spectral samples based on the convex hull, and feature wavelength filtering. These results will help build a new framework for product quality evaluation based on near-infrared spectra.
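
As a rough illustration of the parameter search described in item 3), the sketch below implements a basic quantum-behaved particle swarm optimization (QPSO) loop in Python. It is not the dissertation's implementation: the names qpso_minimize and toy_error, the swarm size, the iteration count, the search bounds, and in particular the quadratic stand-in objective are all assumptions made for this example. In practice the objective would be some validation error of the semi-supervised support vector regression model evaluated at a candidate (γ, λ, σ).

```python
import math
import random


def qpso_minimize(objective, bounds, n_particles=30, n_iters=300, seed=0):
    """Minimize `objective` over the box `bounds` with a basic QPSO loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the search box.
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [objective(x) for x in xs]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iters):
        # Contraction-expansion coefficient, decreased linearly from 1.0 to 0.5.
        beta = 1.0 - 0.5 * t / n_iters
        # Mean of all personal best positions ("mbest").
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                phi = rng.random()
                u = 1.0 - rng.random()  # lies in (0, 1], so log(1/u) is finite
                # Local attractor between the personal and global bests.
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                step = beta * abs(mbest[d] - xs[i][d]) * math.log(1.0 / u)
                x_new = attractor + step if rng.random() < 0.5 else attractor - step
                lo, hi = bounds[d]
                xs[i][d] = min(max(x_new, lo), hi)  # keep the particle in the box
            val = objective(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val


def toy_error(params):
    """Hypothetical stand-in for a validation error over (gamma, lam, sigma)."""
    gamma, lam, sigma = params
    return (gamma - 2.0) ** 2 + (lam - 0.5) ** 2 + (sigma - 1.0) ** 2


# Assumed search ranges for (gamma, lam, sigma); not taken from the dissertation.
best, best_val = qpso_minimize(toy_error, [(0.0, 10.0), (0.0, 1.0), (0.0, 5.0)])
print("best parameters:", best, "objective:", best_val)
```

QPSO needs only one main control parameter (the contraction-expansion coefficient, beta here) and no velocity vector, which is one common reason for preferring it to standard PSO when tuning a small number of model hyperparameters.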
Keywords/Search Tags: Near-Infrared Spectroscopy, Semi-Supervised Learning, Transfer Learning, Neighborhood Preserving Embedding, Quantum-Behaved Particle Swarm Optimization, Affinity Propagation Clustering, Ensemble Learning