Research On Seed Identification Method Of Genetically Modified Agricultural Products Based On Terahertz Time-domain Spectroscopy

Posted on: 2023-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Tuo
GTID: 2530307022456254
Subject: Mechanical engineering
Abstract/Summary:
With global demand for genetically modified agricultural products increasing, traditional chemical detection techniques such as protein detection and DNA detection are time-consuming, prone to damaging the sample structure, dependent on large quantities of reagents, and costly to analyze, so they can no longer meet the needs of modern agricultural seed selection. Terahertz waves, a band of the electromagnetic spectrum, offer low photon energy, strong penetration, and fingerprint spectra. Terahertz time-domain spectroscopy combined with modern pattern recognition theory has therefore become a powerful tool for non-destructive testing.

This thesis takes the seeds of transgenic agricultural products as its research object and uses terahertz time-domain spectroscopy to acquire raw spectral data for eight types of transgenic cotton seeds and three types of transgenic soybean seeds. First, a preprocessing method for terahertz spectral data based on Variational Mode Decomposition (VMD) and the Wavelet Transform (WT) is proposed, which addresses the oscillation and indistinct peak positions in the spectral data. Second, an improved fuzzy C-means clustering model based on t-distributed Stochastic Neighbor Embedding (t-SNE) is proposed to realize unsupervised clustering identification of the three kinds of transgenic soybean seeds. Finally, building on multi-class Relevance Vector Machine (mRVM) theory, an improved mRVM classification model is proposed to realize supervised classification and identification of the eight kinds of transgenic cotton seeds.

The main research contents and results of this thesis are as follows:

(1) To address the poor adaptability of traditional spectral data preprocessing methods, this thesis proposes a terahertz spectral data preprocessing method based on VMD and WT. VMD decomposes the original time-domain spectral data of the transgenic seeds into intrinsic mode function (IMF) components. The effective IMF components, selected by their Pearson correlation coefficient with the original time-domain data, are used to reconstruct the time-domain spectral data. The corresponding frequency-domain data and absorbance data are then calculated from the reconstructed data, and the absorbance data are processed by WT. After WT processing, the absorbance curve is smoother and the absorption peaks are more visible, which essentially eliminates the adverse effects of water vapor in the air, uneven samples, and light scattering.

(2) To address the tendency of the traditional fuzzy C-means clustering method to fall into local optima, and based on the THz spectral characteristics of transgenic soybean seeds, this thesis proposes an improved fuzzy C-means unsupervised clustering method based on t-SNE to identify and analyze transgenic soybean seeds. First, the absorbance spectral data of the transgenic soybean seeds are normalized. Next, feature extraction is performed on the normalized data to obtain a feature matrix, and t-SNE is used to reduce the dimension of the feature matrix. Then, appropriate clustering centers are selected according to the principle of maximizing the distance between classes, and the fuzzy C-means method is used for clustering. This method not only alleviates crowding between classes during clustering, but also reflects the distance information between classes, so that appropriate clustering centers can be selected for the samples. The effectiveness of the proposed method is verified by comparison with Principal Component Analysis (PCA), Locally Linear Embedding (LLE), Locality Preserving Projection (LPP), and the traditional fuzzy C-means clustering method.

(3) To address the difficulty of selecting the kernel parameters of the Support Vector Machine (SVM) and the difficulty of adapting the Relevance Vector Machine (RVM) to multi-class classification, an improved multi-kernel Relevance Vector Machine (ImRVM) is proposed for supervised classification and identification of transgenic cotton seeds. The proposed ImRVM uses a genetic algorithm to optimize the kernel parameters of mRVM, adaptively selecting the optimal kernel parameters according to the characteristics of the terahertz spectral data of transgenic cotton seeds. The practical effectiveness of the method is verified through comparative analysis with mRVM, RVM, SVM, and other methods.
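
As an illustration of the component selection step in item (1), the sketch below keeps only the modes whose Pearson correlation with the original signal reaches a threshold and sums them to reconstruct the signal. It is a minimal sketch, not the thesis implementation: the VMD step itself is not shown, the input `modes` is assumed to have been produced by some VMD routine, and the function names, the synthetic test signal, and the threshold value 0.3 are all assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / (sx * sy)

def select_and_reconstruct(signal, modes, threshold=0.3):
    """Keep modes with |r| >= threshold (assumed value) and sum them."""
    kept = [k for k, mode in enumerate(modes)
            if abs(pearson(signal, mode)) >= threshold]
    recon = [sum(modes[k][i] for k in kept) for i in range(len(signal))]
    return kept, recon

# Synthetic stand-in for decomposed modes: two dominant components
# and one weak high-frequency component playing the role of noise.
n = 200
t = [i / n for i in range(n)]
modes = [
    [math.sin(2 * math.pi * 2 * ti) for ti in t],
    [0.5 * math.sin(2 * math.pi * 7 * ti) for ti in t],
    [0.05 * math.sin(2 * math.pi * 60 * ti) for ti in t],
]
signal = [sum(m[i] for m in modes) for i in range(n)]

kept, recon = select_and_reconstruct(signal, modes)
print("kept mode indices:", kept)
```

In this synthetic case the weak high-frequency mode correlates poorly with the full signal and is dropped, while the two dominant modes are retained. In practice the threshold would have to be tuned to the measured spectra.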
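
The clustering step in item (2) can be sketched as fuzzy C-means with initial centers chosen to be far apart. This is a hedged illustration only: the t-SNE reduction is not shown (the 2-D points below merely stand in for an embedded feature matrix), and farthest-point initialization is one plausible reading of "maximizing the distance between classes", not necessarily the rule used in the thesis. The function names, the fuzzifier `m=2.0`, and the toy data are assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_point_centers(points, c):
    """Pick c initial centers that are mutually far apart (assumed rule)."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centers = [max(points, key=lambda p: dist(p, mean))]
    while len(centers) < c:
        # next center: the point whose nearest chosen center is farthest away
        centers.append(max(points, key=lambda p: min(dist(p, q) for q in centers)))
    return [list(ctr) for ctr in centers]

def memberships(points, centers, m):
    """Standard fuzzy C-means membership update; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        d = [max(dist(p, ctr), 1e-12) for ctr in centers]  # avoid division by zero
        rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return rows

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    dim = len(points[0])
    centers = farthest_point_centers(points, c)
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append([sum(wk * p[d] for wk, p in zip(w, points)) / total
                                for d in range(dim)])
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

# Toy 2-D data standing in for a t-SNE embedding of three seed types.
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5)]
groups = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(gx + ox, gy + oy) for gx, gy in groups for ox, oy in offsets]

centers, u = fuzzy_c_means(points, c=3)
labels = [max(range(3), key=lambda i: row[i]) for row in u]
print(labels)
```

Starting from well-separated centers gives each cluster its own seed point, which is the intuition behind avoiding the poor local optima that random initialization can produce.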
Keywords: Terahertz time-domain spectroscopy, Transgenic seeds, Variational Mode Decomposition, Wavelet analysis, t-SNE, Fuzzy C-means clustering, Improved multi-kernel Relevance Vector Machine