Research On Feature Selection Based On Label Enhancement And Its Application In Leaf Species Recognition

Posted on: 2023-01-26 | Degree: Master | Type: Thesis
Country: China | Candidate: Y S Xiong | Full Text: PDF
GTID: 2543306803465344 | Subject: Agriculture
Abstract/Summary:
With the growth of big data, data in many application scenarios has become higher-dimensional and semantically richer. Feature selection is an important data preprocessing step in machine learning and data mining: it selects a representative feature subset and removes redundant and noisy features. Label distribution learning, a new learning paradigm, also suffers from the curse of dimensionality. In multi-label learning, each instance is associated with multiple labels, all of equal importance. In label distribution learning, each instance is also associated with multiple labels, but their importance is not necessarily equal. The curse of dimensionality in label distribution learning still needs further research. In addition, a label enhancement algorithm can transform multi-label data into label distribution data, thereby enriching the supervision information of the data. This paper therefore focuses on label enhancement and label distribution feature selection algorithms, and applies them to plant leaf species identification. The research is as follows.

First, this paper proposes a new feature selection algorithm for label distribution data. The algorithm combines sparse learning, a feature similarity measure, and a label correlation measure. Sparse learning uses a norm that drives the solved parameter matrix toward row sparsity. The feature similarity measure is based on granular computing theory: by measuring local feature similarity within the neighborhood granule of each sample, it pulls the parameter-matrix vectors of highly similar features closer together. In addition, the Pearson correlation coefficient is used to measure the correlation between labels, and the Euclidean distance is used to measure the degree of that correlation. Finally, the effectiveness of the proposed algorithm is verified by comparison with five mainstream algorithms on twelve public datasets using six evaluation metrics.

In the real world, label distribution data is very difficult to annotate, so most available data is multi-label. The importance of each label may differ in reality, yet most existing multi-label feature selection algorithms treat all labels as equally important. A label enhancement algorithm is therefore used to convert multi-label data into label distribution data, increasing the supervision information of the data. On the enhanced label distribution data, the correlation between labels is used for feature selection. The selected feature subset is then fed into a multi-label classifier to improve classification performance. The algorithm is verified on fifteen multi-label datasets, where it proves effective in comparison with six mainstream feature selection algorithms on six evaluation metrics.

Finally, the feature selection algorithm based on label enhancement proposed in this paper is applied to plant leaf species recognition. The model consists of four steps: data processing, label enhancement, feature selection, and classification. Leaf data is multi-class data, which is converted into multi-label data during data processing. A label enhancement framework based on deep forest then transforms the multi-label data into label distribution data to enhance its supervision information, after which feature selection picks the most important features. Classification is then performed with multiple classifiers to verify whether feature selection based on label enhancement improves classification accuracy. Experimental results show that the model is effective for leaf species identification in most cases.
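
The row-sparse selection idea and the Pearson label correlation mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not name the norm, so the sketch assumes the commonly used L2,1 norm (the sum of the Euclidean norms of the rows), and the function names, the weight matrix `W`, and the label distribution matrix `D` are hypothetical toy values.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (assumed regularizer)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def rank_features(W):
    """Return feature indices ordered from largest to smallest row norm.

    Each row of W holds the weights of one feature across all labels,
    so a near-zero row marks a feature the model barely uses.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms, kind="stable")

def label_correlation(D):
    """Pearson correlation between label columns of D (samples x labels)."""
    return np.corrcoef(D, rowvar=False)

# Hypothetical learned weights: 4 features x 2 labels.
W = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [0.1, 0.0],
              [1.0, 0.0]])

# Hypothetical label distributions: 3 samples x 2 labels, rows sum to 1.
D = np.array([[0.7, 0.3],
              [0.2, 0.8],
              [0.5, 0.5]])

ranking = rank_features(W)      # features sorted by importance
selected = ranking[:2]          # keep the two strongest features
C = label_correlation(D)        # 2 x 2 label correlation matrix
```

In a full method, `W` would come from minimizing a regression loss plus such a regularizer; here it is fixed by hand only to show how row norms translate into a feature ranking.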
Keywords/Search Tags: label distribution learning, multi-label learning, feature selection, label enhancement, plant leaf classification