
Research On Structured Data Classification Based On Information Granules

Posted on: 2022-09-10 | Degree: Master | Type: Thesis
Country: China | Candidate: P F Li | Full Text: PDF
GTID: 2518306575465624 | Subject: Computer Science and Technology
Abstract/Summary:
Structured data classification has long been a key research problem in machine learning and data mining, and classification problems are common in both industrial manufacturing and daily life. Traditional classification research mostly treats data samples as independent. In practice, however, the attributes of different samples are correlated, and similarity among attributes gives rise to different topological structures in the data distribution. Granular computing offers a way to handle this, and information granules are of practical significance for building classifiers in a granular space. From the perspective of granular computing, this thesis therefore carries out two pieces of work on constructing granular classification models for structured data.

First, to improve the generalization ability and efficiency of ensemble classification models, and to address the extra memory overhead and computation time caused by a large ensemble system, a self-sampling ensemble classification method based on attribute reduction is proposed, building on research in granular computing and attribute reduction. The method applies a strategy that combines ant colony optimization with attribute reduction to the original feature set, yielding several optimal granular feature spaces. Using different granular spaces as the feature input of the classification model reduces the memory consumption and computation time of the classifier to a certain extent. Each base classifier is then trained iteratively by a self-sampling method that takes the learning result and learning speed of the samples as constraints.

Second, because large-scale data in the era of big data cannot be loaded completely into the memory of a single machine for classification, this thesis proposes a large-scale data classification model based on information granularity fusion. The model reads the data set piece by piece and places the data into different buffers according to category. The fuzzy c-means clustering algorithm then granulates each portion of the data into local information granules, and the local granules of each category are fused, guided by the principle of reasonable granularity, into higher-level information granules that characterize a partial view of the original data set. A global granular hypersphere is then constructed by combining the information granules of every category to describe the key structural features of the corresponding data, which yields a rule-based granular classification model.

Experiments and analyses of the proposed methods are carried out on artificial and public data sets, and the results are compared with those reported in the literature. The experiments verify that the proposed methods are effective for constructing classification models based on information granules. This research provides a new approach to structured data classification.
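As a rough illustration of the information granularity fusion idea summarized in the abstract, the following Python sketch processes data in chunks, runs fuzzy c-means on each per-class buffer, and fuses the resulting prototypes into one hypersphere per class. It is a minimal sketch under stated assumptions, not the thesis's actual model: the function names, the use of the prototype mean as the fused center, the coverage-times-specificity criterion for the radius (one common reading of the principle of reasonable granularity, often rendered in English as the principle of justifiable granularity), and the nearest-boundary classification rule are all choices made here for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=50, seed=0):
    """Plain fuzzy c-means; returns the c cluster prototypes."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # standard membership update
    return centers

def justifiable_radius(center, points):
    """Assumed criterion: pick the radius maximizing coverage * specificity."""
    d = np.sort(np.linalg.norm(points - center, axis=1))
    r_max = d[-1] if d[-1] > 0 else 1.0
    coverage = np.arange(1, len(d) + 1) / len(d)  # share of points inside radius
    specificity = 1.0 - d / r_max                 # smaller spheres are more specific
    return d[np.argmax(coverage * specificity)]

def build_granules(chunks, c=3):
    """chunks: iterable of (X, y) blocks. Returns {label: (center, radius)}."""
    prototypes = {}
    for X, y in chunks:                           # only one chunk is in memory at a time
        for label in np.unique(y):
            Xk = X[y == label]                    # per-class buffer for this chunk
            k = min(c, len(Xk))
            prototypes.setdefault(label, []).append(fuzzy_c_means(Xk, k))
    granules = {}
    for label, protos in prototypes.items():
        P = np.vstack(protos)                     # local granules of this class
        center = P.mean(axis=0)                   # assumed fusion rule: prototype mean
        granules[label] = (center, justifiable_radius(center, P))
    return granules

def classify(x, granules):
    """Assumed rule: choose the class whose hypersphere boundary is nearest
    (a negative value means x lies inside that hypersphere)."""
    return min(granules,
               key=lambda k: np.linalg.norm(x - granules[k][0]) - granules[k][1])
```

Here `build_granules` takes any iterable of `(X, y)` array pairs, so the chunks could come from a file reader that yields one block at a time. Only the cluster prototypes are retained between chunks, which is what keeps the memory footprint small in this sketch.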
Keywords/Search Tags: information granule, attribute reduction, ant colony optimization, fuzzy c-means clustering, reasonable granularity principle