Study On Feature Hierarchical Selection And High Performance Classification Model

Posted on: 2019-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Pei
GTID: 2428330542994304
Subject: Information and Communication Engineering
Abstract/Summary:
In recent decades, image processing technology has been widely used in medical, military, agricultural, industrial and other fields. In the era of information explosion, the amount of raw data is so large that, even after feature extraction, the feature dimension remains high and information processing is not fast enough. For intelligent tobacco leaf grading, classification accuracy and speed directly affect whether an intelligent grading system can be applied in practice. Improving recognition accuracy and reducing classification time requires not only a classification model with better performance but also feature screening. The specific research work in this dissertation is therefore as follows:

1. Perform low-level processing on the images. Transmission images of 1588 tobacco leaves belonging to 41 grades were collected in 2016. For all samples, the background is segmented, noise is removed, and features are extracted and normalized.

2. Select the best tobacco grading model. Five classification models are applied and studied: SRC, SVM, RF, AdaBoost and BP. The grading models are compared before feature screening, using the original 104 features. The results show that SRC has the highest recognition accuracy, followed by SVM, and that RF has the fastest recognition speed. Considering model establishment time, classification speed and accuracy, RF and SVM are each used as the fitness function during feature screening.

3. Screen features by hierarchical selection. In the first layer, the dispersion ratio algorithm reduces the number of features from the 104 initial features to 80. In the second layer, IPSO, SGA, AGA, LAGA, ILAGA and CCGA are each run to obtain a better feature combination set. Finally, frequent feature sets are formed by calculating the support and confidence of the features, and the final feature combination is selected using an adaptive algorithm based on support and confidence.

4. The experimental results show that the proposed hierarchical screening of feature combinations not only improves grading accuracy but also greatly reduces grading time. With RF as the fitness function, 28 features were selected; the grading accuracy reaches 89.42% and the classification speed is 12,500 pieces/s. With SVM as the fitness function, a subset of 28 features was also obtained; its grading accuracy reaches 89.42% and its speed is 3,333 pieces/s.
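
The support and confidence calculation over the feature subsets returned by the different optimizers can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy `runs` data, the function names and the fixed `min_support` threshold are all assumptions made for illustration, and a fixed threshold stands in for the adaptive selection rule, which the abstract does not specify.

```python
def support(itemset, runs):
    """Fraction of runs whose selected subset contains every feature in itemset."""
    items = set(itemset)
    return sum(items <= run for run in runs) / len(runs)


def confidence(antecedent, consequent, runs):
    """Share of runs containing the antecedent that also contain the consequent."""
    base = support(antecedent, runs)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), runs) / base


def frequent_features(runs, min_support):
    """Features whose individual support reaches min_support (assumed fixed threshold)."""
    all_features = set().union(*runs)
    return sorted(f for f in all_features if support({f}, runs) >= min_support)


# Hypothetical toy data: feature indices selected by four optimizer runs.
runs = [{1, 2, 3, 5}, {1, 2, 5, 8}, {1, 3, 5, 8}, {1, 2, 5}]

selected = frequent_features(runs, min_support=0.75)
print(selected)
print(confidence({8}, {2}, runs))
```

In this toy data, features 1 and 5 appear in every run and feature 2 in three of the four, so those three pass a 0.75 support threshold, while features 3 and 8 (two runs each) do not.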
Keywords/Search Tags:Tobacco, Random Forest, SVM, Genetic Algorithm, Particle Swarm Algorithm, Frequent Itemsets