
Research And Application Of Attribute Reduction Algorithm Based On Neighborhood Rough Set

Posted on: 2021-08-24
Degree: Master
Type: Thesis
Country: China
Candidate: D Li
Full Text: PDF
GTID: 2518306725452354
Subject: Software engineering
Abstract/Summary:
Rough set theory is a mathematical theory for handling imprecise information. After years of research and development, it has been widely used in pattern recognition, data mining, and other fields. The neighborhood rough set is an extended rough set model: it generalizes the equivalence relation to a neighborhood relation, which overcomes the limitation that classical rough sets can handle only discrete data and broadens the application range of rough set theory. Attribute reduction is the core of the neighborhood rough set. It deletes redundant attributes while maintaining the classification ability of the information system, which both reduces storage requirements and improves decision-making ability.

The outline of this thesis is as follows. The neighborhood radius is the most important parameter of the neighborhood rough set. First, the importance of the neighborhood radius is proven. Then, drawing on many existing studies, methods for setting a multi-neighborhood radius and a single-neighborhood radius based on attribute standard deviation are given for different application scenarios. Compared with the empirical method, these two methods effectively reduce the error in setting the neighborhood radius. Finally, a rule for the neighborhood radius is given. This study provides theoretical support for setting the neighborhood radius in the variable precision neighborhood rough set.

Compared with the traditional neighborhood rough set, the variable precision neighborhood rough set handles noisy data better, but it still has shortcomings. The improvements are as follows:

(1) In the variable precision neighborhood rough set, the traditional attribute measurement function performs poorly and cannot select the attribute with the best classification capability. To address this problem, a metric function based on attribute quality is proposed. Building on attribute importance, the metric function takes the average division accuracy in the neighborhood as a quality factor, and comprehensively considers both the division accuracy in the neighborhood and the positive region after an attribute is added. An attribute reduction algorithm based on attribute quality is then proposed. Multiple sets of comparative experiments show that the algorithm can select an optimal attribute subset as the precision changes, which improves the quality of attribute reduction.

(2) The positive region calculation involves many unnecessary computations, which leads to a high time cost. To address this problem, an absolute positive region is proposed and used to prove the monotonic relationship between the positive region and the attribute subset. Then, the samples are sorted to reduce the number of samples to be measured, and a decision classification strategy is applied to the measured samples to avoid unnecessary metric calculations. Finally, a fast attribute reduction algorithm based on decision classification is proposed. Multiple sets of comparative experiments show that the algorithm reduces unnecessary metric calculations and improves the speed of attribute reduction.

Finally, the two algorithms are combined to design an attribute reduction model based on the variable precision neighborhood rough set. This model both speeds up attribute reduction and selects the best attribute subset. The model is applied to English spam classification, where it is mainly used to delete redundant attributes from the feature word sets and thereby improve the performance of the spam classifier. Experiments under different variable precision thresholds show that the attribute reduction model effectively improves the spam recognition rate, which gives it practical significance.
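To make the notions of neighborhood, positive region, and attribute reduction more concrete, the following is a minimal Python sketch of a generic forward greedy reduction on a neighborhood rough set. It is not the attribute-quality metric or the fast decision-classification algorithm proposed in the thesis, whose exact definitions are not given in the abstract. The function names, the use of Euclidean distance, the single fixed radius `delta`, the inclusion threshold `beta`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def neighborhood(data, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the attribute indices in attrs (assumed metric)."""
    xi = data[i]
    return [
        j for j, xj in enumerate(data)
        if math.sqrt(sum((xi[a] - xj[a]) ** 2 for a in attrs)) <= delta
    ]


def positive_region(data, labels, attrs, delta, beta=1.0):
    """Samples whose neighborhood is covered by one decision class to a
    degree of at least beta. beta = 1.0 requires a pure neighborhood;
    beta < 1.0 tolerates some noise (the variable precision idea)."""
    pos = set()
    for i in range(len(data)):
        nbrs = neighborhood(data, i, attrs, delta)  # always contains i
        top = Counter(labels[j] for j in nbrs).most_common(1)[0][1]
        if top / len(nbrs) >= beta:
            pos.add(i)
    return pos


def greedy_reduct(data, labels, delta, beta=1.0):
    """Forward greedy search: repeatedly add the attribute that enlarges
    the positive region most, and stop when no attribute adds anything."""
    n_attrs = len(data[0])
    reduct, best = [], 0
    while len(reduct) < n_attrs:
        gains = {
            a: len(positive_region(data, labels, reduct + [a], delta, beta))
            for a in range(n_attrs) if a not in reduct
        }
        a_best = max(gains, key=gains.get)
        if gains[a_best] <= best:
            break  # remaining attributes are redundant under this measure
        reduct.append(a_best)
        best = gains[a_best]
    return reduct


# Toy data (hypothetical): attribute 0 separates the two classes,
# attribute 1 does not.
data = [
    [0.10, 0.50], [0.15, 0.90], [0.20, 0.10],
    [0.80, 0.55], [0.85, 0.95], [0.90, 0.15],
]
labels = [0, 0, 0, 1, 1, 1]
reduct = greedy_reduct(data, labels, delta=0.2)
```

On this toy data the search keeps attribute 0 and drops attribute 1, because adding attribute 1 does not enlarge the positive region. Note that every candidate evaluation recomputes all neighborhoods from scratch, which is the kind of repeated positive region computation that the abstract identifies as a time cost.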
Keywords/Search Tags: Variable Precision Neighborhood Rough Set, Attribute Reduction, Neighborhood Radius, Attribute Quality, Decision Classification