
Research On Feature Selection Algorithm Based On Rough Set Model Extension

Posted on: 2021-05-31 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Y Wu | Full Text: PDF
GTID: 2428330620965563 | Subject: Software engineering
Abstract/Summary:
Rough set theory is a mathematical tool for dealing with incompleteness and uncertainty, proposed by the Polish scholar Pawlak in the 1980s, and it has been widely used in many research fields. However, classic rough set theory has shortcomings in handling uncertain data and numerical data: it copes poorly with noisy data and does not preserve the internal structure of numerical data. To address these problems, extensions of the rough set model have become a research hotspot for scholars at home and abroad. This thesis studies two extensions of the classic rough set model, the decision-theoretic rough set model and the neighborhood rough set model, and improves their corresponding feature selection algorithms, aiming to raise classification accuracy without changing the classification mechanism. The main research contents are as follows:

(1) This thesis addresses the problem that the positive region in the decision-theoretic rough set model does not change monotonically as attributes are added. First, decision rules derived from Bayesian decision theory are introduced to determine whether an object belongs to the positive region. Then, on this basis, a new definition of a reduct is proposed: the positive region of the reduct must be no smaller than the positive region of the full attribute set. Finally, a new feature selection algorithm is proposed in combination with a heuristic search strategy. Comparative experiments show that the algorithm maximizes the positive region and achieves higher classification accuracy, thereby improving the efficiency of the algorithm.

(2) Although the improved positive-region feature selection algorithm performs well, it may not be able to directly classify mixed-data samples in the positive region when placed in the neighborhood rough set model. In addition, when the neighborhood rough set model characterizes the classification ability of attribute subsets, it cannot describe the neighborhood of mixed samples. To address these two problems, the rest of the thesis focuses on the characteristics of the neighborhood rough set model. First, the advantages of the δ-neighborhood and of the k-nearest-neighbor neighborhood are analyzed separately, and a new neighborhood rough set model is proposed by combining the two; the model induces new information granules and uses an iterative strategy to compute the upper and lower approximations. Then, the variable precision model is introduced into the improved neighborhood rough set model to handle noisy data. Finally, an improved feature selection algorithm is designed using a forward greedy search strategy. Analysis of the experimental results shows that this algorithm generalizes well and can effectively remove redundant attributes without reducing classification accuracy.
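The general shape of positive-region feature selection in a neighborhood rough set can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's algorithm: it uses a plain δ-neighborhood with Euclidean distance (the combined δ / k-nearest-neighbor granule and the iterative approximation scheme are not reproduced), a simple inclusion-degree threshold `beta` stands in for the variable precision model, and all function and parameter names (`neighbors`, `positive_region`, `forward_greedy_reduct`, `delta`, `beta`, `eps`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def neighbors(X, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the attribute indices in attrs."""
    return [
        j for j in range(len(X))
        if sqrt(sum((X[i][a] - X[j][a]) ** 2 for a in attrs)) <= delta
    ]


def positive_region(X, y, attrs, delta, beta=1.0):
    """Samples whose neighborhood falls into one decision class with
    inclusion degree >= beta (beta = 1.0 requires a pure neighborhood)."""
    region = set()
    for i in range(len(X)):
        nb = neighbors(X, i, attrs, delta)  # always contains i itself
        majority = Counter(y[j] for j in nb).most_common(1)[0][1]
        if majority / len(nb) >= beta:
            region.add(i)
    return region


def dependency(X, y, attrs, delta, beta=1.0):
    """Fraction of samples that lie in the positive region."""
    return len(positive_region(X, y, attrs, delta, beta)) / len(X)


def forward_greedy_reduct(X, y, delta, beta=1.0, eps=1e-9):
    """Forward greedy search: repeatedly add the attribute that enlarges
    the positive region most; stop when no attribute improves it."""
    remaining = list(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        scored = [(dependency(X, y, selected + [a], delta, beta), a)
                  for a in remaining]
        # Ties are broken in favor of the lower attribute index.
        score, attr = max(scored, key=lambda t: (t[0], -t[1]))
        if score - best <= eps:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected, best


# Toy data: attribute 0 separates the classes, attribute 1 is constant.
X = [[0.1, 0.5], [0.2, 0.5], [0.8, 0.5], [0.9, 0.5]]
y = [0, 0, 1, 1]
print(forward_greedy_reduct(X, y, delta=0.3))
```

On this toy data the search keeps only attribute 0, since the constant attribute 1 adds nothing to the positive region. Lowering `beta` below 1.0 tolerates some label disagreement inside a neighborhood, which is the role the abstract assigns to the variable precision model for noisy data.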
Keywords/Search Tags:Decision theory rough set, Neighborhood rough set, Feature selection, Positive region, Greedy search strategy