
Research on Terahertz Spectral Feature Extraction and Recognition Based on Relevance Vector Machine

Posted on: 2017-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Zhong
Full Text: PDF
GTID: 2358330488450192
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Terahertz time-domain spectroscopy (THz-TDS) is an excellent non-contact, non-destructive detection technology, because its low photon energy does not cause photoionization when the radiation passes through a material. The THz spectra of different compounds and crystals have different overall characteristics, which can be used for substance identification and physicochemical analysis. In recent years, with continuous breakthroughs in terahertz radiation sources, together with the miniaturization and falling cost of detection equipment, terahertz transmission spectroscopy has been widely used in biotechnology, food safety, pharmaceuticals, analytical chemistry, and other fields. At the same time, as the spectra of various materials keep accumulating, more and more researchers have collected terahertz spectra and built databases that are open for public use.

However, the vibration frequencies of the nonlocal structures of macromolecules, such as the secondary helices of protein molecules and the torsional, phonon, and skeleton vibrations of co-crystals, lie in the THz range. They are easily affected by fluctuations in the external environment and by structural changes, so the overall waveform changes. Multiple data sets measured under the same conditions may differ from each other, and local features may not be prominent, which makes it hard to study the correspondence between physicochemical properties and local graphical features of the spectrum. Therefore, for the tasks of typical sample extraction and global feature extraction from spectral data sets, this thesis carries out work in the following three areas:

(1) We collect spectral data from three public foreign terahertz databases, using web crawlers and manual methods. In addition, the spectra of some common materials, such as saccharose, NaCl, and starch, are measured with our own THz-TDS system. We then normalize all spectra for further analysis using Savitzky-Golay (S-G) filtering, interpolation, and resampling.

(2) A global graphical feature extraction method is proposed. It is based on a kernel-optimized relevance vector machine (RVM) and differs from traditional local feature extraction methods. We extract features from three kinds of materials with distinct local and global features, and reconstruct a fitted model of each curve. For comparison with RVM, the same procedure is carried out with the epsilon-SVR algorithm. The results show that the relevance vector machine, with the kernel parameters of its basis functions optimized by E-M, is well suited to feature extraction from terahertz spectra. A sparse representation of each curve is achieved, and the number of features is controllable. The regression models built from these features preserve the global characteristics of the spectral curves, and the quality of fit is more consistent across bands. In addition, the feature points can be used as feature vectors to find relationships or similarities between the spectra of different materials.

(3) A multi-class identification and standard spectrum extraction method for terahertz spectroscopy is proposed, based on a multi-kernel, multi-class relevance vector machine. A multi-class identification model for 7 kinds of samples is constructed, and the typical spectrum of every class is extracted. For comparison, C-SVM is also trained for multi-class identification, and the support vectors it extracts are used to construct the corresponding model. To verify the typicality of the relevance samples, we use k-means to generate the cluster center of every class, and then analyze the closeness between the relevance vectors and the centers. After that, for dimensionality reduction, we use PCA to display all samples in 3D space, so that the relative positions of the cluster centers can be measured and the relevance samples can be observed intuitively. The results show that the multi-class RVM not only builds a model that classifies well, but also gives probabilistic classification output by introducing a probabilistic auxiliary matrix, which is an advantage over conventional multi-class classification algorithms. Moreover, the extracted relevance vectors have a certain typicality, and can be used as typical spectra in graphical analysis across different materials.
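
The preprocessing in step (1) (S-G filtering, interpolation, and resampling) can be illustrated with a minimal sketch. The abstract does not give the filter window, polynomial order, interpolation scheme, or normalization rule, so everything specific here is an assumption for illustration: a 5-point quadratic Savitzky-Golay window, linear interpolation onto a shared frequency grid, and min-max scaling. The function names (`sg_smooth`, `resample`, `min_max`, `preprocess`) are hypothetical and are not taken from the thesis.

```python
from bisect import bisect_right

# 5-point quadratic Savitzky-Golay smoothing coefficients.
# Window length and polynomial order are assumptions, not values from the thesis.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(y):
    """Smooth y with the 5-point filter; the two points at each edge are left unchanged."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG5))
    return out


def resample(freq, y, grid):
    """Linearly interpolate (freq, y) onto grid.

    freq must be strictly increasing and must cover every point of grid.
    """
    out = []
    for f in grid:
        # Index of the right-hand neighbour, clamped so both neighbours exist.
        j = min(max(bisect_right(freq, f), 1), len(freq) - 1)
        f0, f1 = freq[j - 1], freq[j]
        t = (f - f0) / (f1 - f0)
        out.append(y[j - 1] + t * (y[j] - y[j - 1]))
    return out


def min_max(y):
    """Scale y to [0, 1]; a constant curve maps to all zeros."""
    lo, hi = min(y), max(y)
    if hi == lo:
        return [0.0] * len(y)
    return [(v - lo) / (hi - lo) for v in y]


def preprocess(freq, y, grid):
    """Smooth, resample onto a common grid, then normalize one spectrum."""
    return min_max(resample(freq, sg_smooth(y), grid))
```

Calling `preprocess` on every spectrum with the same `grid` yields curves of equal length and scale, which is what allows spectra from different databases and instruments to be compared or passed to one model.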
Keywords/Search Tags: Terahertz spectrum, Kernel optimization, Feature extraction, Multiclass, RVM, Standard spectrum