The Study Of Classification Based On Granular Computing

Posted on: 2022-10-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Fu
GTID: 1488306341986229
Subject: Control theory and control engineering
Abstract/Summary:
Data classification is an important data processing method in machine learning. Its main task is to build a model that captures the mapping between samples and labels by learning from a set of samples with class labels. In the information age, data usually contains knowledge from multiple levels and perspectives, so it is worth studying how classification problems can be solved in the way human beings handle complex problems. Granular computing, an emerging information processing method, simulates how humans perceive, analyze, and process problems, and can help users explore multi-level, multi-perspective knowledge hidden in the data. This dissertation therefore studies data classification within the framework of granular computing. The main contributions are as follows:

(1) Traditional information granulation methods can only perform granular classification modeling at a single information granularity. To address this, we propose a classification modeling method with multiple information granularities. When data is diversely distributed, an information granule constructed at a single granularity may fail to capture the essential characteristics of the data. We introduce the concepts of coverage and specificity of information granules and transform information granulation with multiple granularities into a boundary optimization problem for information granules, constrained by an information granularity level variable. Solving the optimization problem for different granularity values yields the information granules at the corresponding granularity levels. We then combine the proposed granulation method with rule-based classification methods and construct a classification model based on an information granule expression with multiple information granularities. Numerical experiments show that the constructed model balances classification accuracy against the simplicity of the classification rules.

(2) The overall characteristics of the data are easily overlooked in multi-dimensional data classification. To address this, we propose a data classification algorithm based on union hypersphere information granules. When facing multi-dimensional data, human beings intuitively refer to the overall distribution of the data to make a classification decision, but many common classification algorithms ignore this modeling idea and focus only on improving classification accuracy. We design two information granule expressions for multi-dimensional data, the "hypersphere information granule" and the "union hypersphere information granule", and propose a classification algorithm based on the latter. Following the idea of multiple information granularities, a union hypersphere information granule is constructed around each class of multi-dimensional data. The classification model is then built by taking each union hypersphere information granule as the condition part, and its class label as the conclusion part, of a classification rule. Experimental results show that each union hypersphere information granule constructed during modeling describes the overall distribution of its class. This gives the model better classification accuracy than some classic classification algorithms and also makes it more intuitive and easier to understand.

(3) For imbalanced data classification, we propose a classification method based on granular description. Common methods for imbalanced data preprocess the data before modeling according to the imbalance ratio, that is, the ratio of majority-class samples to minority-class samples. This damages the original data to some extent and ignores its essential characteristics. To address this, we first study the granular description of the data and then propose a bottom-up information granule fusion method based on that description. By constructing two union information granules of very different sizes on the majority class and the minority class, the method describes and captures the distribution features of the samples, especially those of the minority class. In addition, we use the Minkowski distance with different parameters for the distance computations in the granule fusion process, in order to explore more detailed features of imbalanced data samples. Numerical experiments show that the proposed method can construct a reliable classification model for imbalanced data without preprocessing such as data resampling or feature selection.
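
The abstract gives no formulas, so the following is only an illustrative sketch of the ideas it names: a hypersphere information granule scored by coverage and specificity, a class rule built from a union of such granules, and a Minkowski distance with an adjustable order. The function names, the linear form of specificity, the product criterion coverage × specificity, and the nearest-boundary decision rule are all assumptions made for illustration and are not taken from the dissertation.

```python
def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def coverage(center, radius, points, p=2.0):
    """Fraction of the points that fall inside the hypersphere granule."""
    inside = sum(1 for pt in points if minkowski(center, pt, p) <= radius)
    return inside / len(points)


def specificity(radius, max_radius):
    """Assumed linear form: 1 for a point-sized granule, 0 at max_radius."""
    return max(0.0, 1.0 - radius / max_radius)


def build_granule(points, p=2.0):
    """Center the granule on the mean and pick the candidate radius that
    maximizes coverage * specificity (an assumed trade-off criterion)."""
    dim = len(points[0])
    center = [sum(pt[i] for pt in points) / len(points) for i in range(dim)]
    dists = sorted(minkowski(center, pt, p) for pt in points)
    max_radius = dists[-1] or 1.0  # guard against all-identical points
    best = max(
        dists,
        key=lambda r: coverage(center, r, points, p) * specificity(r, max_radius),
    )
    return center, best


def classify(point, model, p=2.0):
    """model maps label -> list of (center, radius) granules (a union).
    Return the label whose union lies closest to the point; the gap is
    negative when the point is inside one of the granules."""
    def gap(label):
        return min(minkowski(point, c, p) - r for c, r in model[label])
    return min(model, key=gap)


if __name__ == "__main__":
    # Hypothetical toy data: one granule for class "a", a union of two for "b".
    model = {
        "a": [((0.0, 0.0), 1.0)],
        "b": [((5.0, 5.0), 1.0), ((8.0, 5.0), 1.0)],
    }
    print(classify((0.5, 0.2), model))
    print(classify((7.5, 5.0), model, p=1.0))
```

Changing `p` changes the shape of each granule (for example, `p=1` gives a diamond-shaped region and `p=2` a Euclidean ball), which is one way to read the abstract's remark about using the Minkowski distance with different parameters.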
Keywords: Information Granules, Multi-Granularity, Union Hypersphere Information Granules, Granular Description, Granular Classification Model