
Studies Of Attribute Reduction Based On Sample Processing Mechanism

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: W D Zhang
GTID: 2428330611997363
Subject: Computer Science and Technology
Abstract/Summary:
Rough set theory is one of the important approaches to intelligent information processing, and it plays a fundamental role in many current data processing methods that imitate human thinking and cognition. However, as technology develops, the forms of data in real-world applications become more and more complicated and diverse, and classical rough set theory is unable to deal with such applications. Using rough set theory on complex problems therefore requires starting from the data and re-extending the related concepts of rough sets. This thesis considers three viewpoints: the sampling of samples, the binary relation between samples, and the labels of samples. By studying rough set models and attribute reduction from these viewpoints, the following research results are obtained:

1. From the viewpoint of rough granular computing, neighborhood decision error rate-based attribute reduction aims to improve the classification performance of the neighborhood classifier. However, for imbalanced data, which is common in real-world applications, such reduction pays little attention to the classification results of samples in the minority class. Therefore, a new attribute reduction strategy is proposed that embeds preprocessing of the imbalanced data.

2. Attribute reduction is one of the core problems of rough set theory. Although it has rich semantic interpretations, it may still cause overfitting. This overfitting differs from overfitting in learning tasks, because the goal of attribute reduction is to find a subset of attributes that meets the given constraints, or to rank the attributes, rather than to train a learning model. Hence, overfitting in attribute reduction means that the attribute subset obtained on the training samples meets the given constraints there, but may fail to meet them on the test samples. Therefore, a truncated heuristic algorithm is designed to mitigate or eliminate the overfitting that occurs in attribute reduction.

3. A pseudo-label neighborhood decision-theoretic rough set is constructed for modeling. In decision-theoretic rough sets, decision costs are used to generate the thresholds that characterize the probabilistic approximations. As with other rough sets, many generalized decision-theoretic rough sets can be formed by using different binary relations. However, most processes for computing binary relations do not take the labels of samples into account, which may lead to lower discrimination; for example, samples with different labels are regarded as indistinguishable. To fill this gap, the main contribution is a pseudo-label strategy for constructing a new decision-theoretic rough set.
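The decision-theoretic construction in the third result can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. The threshold formulas follow the standard decision-theoretic rough set form, in which six decision costs yield a pair (alpha, beta). Everything else is an assumption made for illustration: the function names, the Euclidean distance, the fixed radius, the toy data, and in particular the rule that two samples are neighbors only if they are close and share a pseudo label. The abstract does not say how pseudo labels are generated, so here they are simply given as a list.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dtrs_thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Thresholds (alpha, beta) from six decision costs (standard DTRS form).

    l_xp: cost of accepting (p), deferring (b), or rejecting (n) a sample
          that belongs to the target class; l_xn: the same actions for a
          sample that does not belong to it.
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def neighborhood(i, samples, radius, pseudo_labels=None):
    """Indices of samples within `radius` of sample i.

    ASSUMPTION: when pseudo labels are supplied, a neighbor must also
    carry the same pseudo label as sample i.
    """
    result = []
    for j, other in enumerate(samples):
        if dist(samples[i], other) > radius:
            continue
        if pseudo_labels is not None and pseudo_labels[i] != pseudo_labels[j]:
            continue
        result.append(j)
    return result


def region(i, samples, target, radius, alpha, beta, pseudo_labels=None):
    """Assign sample i to the positive, boundary, or negative region."""
    nbrs = neighborhood(i, samples, radius, pseudo_labels)
    # Conditional probability of the target class within the neighborhood.
    # nbrs always contains i itself, so the division is safe.
    prob = sum(1 for j in nbrs if j in target) / len(nbrs)
    if prob >= alpha:
        return "positive"
    if prob <= beta:
        return "negative"
    return "boundary"


# Hypothetical toy data: four 1-D samples; samples 0 and 1 form the target class.
samples = [(0.0,), (0.1,), (0.2,), (0.3,)]
target = {0, 1}
pseudo = [0, 0, 1, 1]  # assumed to come from some earlier labeling step
alpha, beta = dtrs_thresholds(l_pp=0, l_bp=2, l_np=6, l_pn=8, l_bn=2, l_nn=0)
plain = region(1, samples, target, 0.15, alpha, beta)
with_pseudo = region(1, samples, target, 0.15, alpha, beta, pseudo)
```

In this toy setting, the plain neighborhood of sample 1 also contains sample 2, which lies outside the target class, so sample 1 falls into the boundary region. With the pseudo labels applied, sample 2 is excluded and sample 1 moves to the positive region, which is the kind of gain in discrimination the abstract describes.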
Keywords/Search Tags: Attribute reduction, Decision cost, Overfitting, Pseudo-labeling, Rough set