
The Research On Semi-supervised Feature Selection And Its Classification In Bean Diseases

Posted on: 2023-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: L J Wan
GTID: 2543306803965409
Subject: Agriculture
Abstract/Summary:
Agriculture plays a vital part in our economic system, but crop diseases are one of the factors that restrict "agricultural efficiency, grain production and farmers’ income". There are more than 1,400 types of such diseases, and outbreaks harm social and economic development. The cultivation of beans is currently expanding. Bean crops must withstand environmental stress and also suffer sharp drops in yield caused by a large number of pests and diseases. Labeling the kinds of pests and diseases takes a great deal of manpower, material and other resources. Legumes are grown mainly by farmers, who cannot always identify the types of diseases precisely, so classification research can help those with insufficient expertise identify them. In view of this complex situation, semi-supervised feature selection methods are adopted to classify soybean data, and the classification of real-life bean data sets in a semi-supervised learning setting is taken as the starting point of the research. The key issues are feature selection, semi-supervised learning, and the classification of bean diseases. For feature selection, a semi-supervised feature selection framework for the classification of legume diseases is constructed. Experiments on public UCI data sets demonstrate the effectiveness and feasibility of the algorithm. The main contributions of this thesis are as follows:

1. To address the scarcity of labeled data in semi-supervised settings, a novel semi-supervised feature selection method (Semi-CG) is proposed, based on granular computing theory, by combining consistency and knowledge granularity. The framework effectively removes redundant features and improves classification performance. In the proposed framework, the dependency of the positive region measures the consistency of the labeled instances, and the knowledge granularity, obtained by granulating the unlabeled data, measures how well a feature distinguishes the samples in the sample space. On this basis, and taking the data distribution into account, a measure of feature importance for semi-supervised learning is proposed. Comprehensive experiments on eight UCI data sets show the superiority of the proposed method over other state-of-the-art methods and demonstrate the effectiveness and feasibility of Semi-CG on semi-supervised data.

2. To better mine the feature information of unlabeled data, a novel semi-supervised feature selection framework (GSSF) is proposed based on the discernibility matrix. In the proposed framework, the discernibility matrix is used to explore the relevance between the feature and label spaces under the supervised information in the labeled data, and the indispensable features of each decision class are selected. Furthermore, an evaluation metric is presented to assess the significance of features on semi-supervised data, in which the importance of a feature on unlabeled data is measured by the mutual information between that feature and the indispensable features. The classification performance of the proposed method is compared with that of four state-of-the-art semi-supervised feature selection methods, and extensive experiments on UCI data sets demonstrate the effectiveness of the proposed algorithm on semi-supervised data.

3. Based on the GSSF method, experiments are run with different proportions of labeled data and the corresponding unlabeled data as input. The results show that, on the real bean data set from the public UCI repository, the features selected by the algorithm yield excellent classification performance compared with other methods.

To sum up, several new semi-supervised feature selection methods are proposed to address the classification of bean diseases in practice. The methods combine semi-supervised learning, granular computing theory and application analysis. The experimental analysis shows that the classification performance for bean pests and diseases depends mainly on the classification algorithm and the feature selection method. Accurate information improves recognition accuracy and classification performance on semi-supervised data, and an effective feature selection algorithm reduces computational complexity.
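
The three quantities named in the abstract (positive-region dependency on labeled data, knowledge granularity on unlabeled data, and mutual information between features) can be illustrated with a minimal Python sketch. It uses the standard rough-set and information-theoretic definitions for discrete features. The function names and the toy data are hypothetical, and the sketch does not reproduce how Semi-CG or GSSF combine these quantities into a feature-importance score, which the abstract does not specify.

```python
from collections import Counter, defaultdict
from math import log2


def partition(rows, features):
    """Group sample indices into equivalence classes by their values on `features`."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[f] for f in features)].append(i)
    return list(blocks.values())


def dependency(rows, labels, features):
    """Positive-region dependency: fraction of labeled samples lying in
    equivalence classes whose members all share one label."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, features):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def knowledge_granularity(rows, features):
    """Knowledge granularity: sum of squared block sizes divided by |U|^2.
    Smaller values mean the features split the samples more finely."""
    n = len(rows)
    return sum(len(block) ** 2 for block in partition(rows, features)) / (n * n)


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete value sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical toy data: two discrete features per sample.
labeled = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "a", "b", "b"]
unlabeled = [(0, 0), (0, 0), (0, 1), (1, 1)]

print(dependency(labeled, labels, [0]))        # feature 0 separates the labels
print(dependency(labeled, labels, [1]))        # feature 1 does not
print(knowledge_granularity(unlabeled, [0]))
print(knowledge_granularity(unlabeled, [0, 1]))
print(mutual_information([r[0] for r in unlabeled], [r[1] for r in unlabeled]))
```

In this toy example, feature 0 alone places every labeled sample in the positive region, while adding feature 1 lowers the knowledge granularity on the unlabeled samples, which is the kind of complementary evidence a semi-supervised measure can draw on.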
Keywords/Search Tags: semi-supervised learning, bean diseases, feature selection, granular computing, discernibility matrix