Research On Feature Selection Algorithm Based On Rough Sets

Posted on:2014-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:C W Li

Full Text:PDF

GTID:2268330401462537

Subject:Computer software and theory

Abstract/Summary:

Rough set theory, proposed by the Polish mathematician Z. Pawlak, is a soft computing tool for dealing with fuzzy and uncertain data and is one of the active research topics in artificial intelligence. Because of its distinctive and innovative ideas, rough set theory has attracted much attention over the past 30 years. Researchers have developed several generalized rough set models, such as fuzzy rough sets, dominance-based rough sets, decision-theoretic rough sets and variable precision rough sets. These models have been applied successfully in a wide range of fields, including machine learning, pattern recognition, decision support, process control, knowledge discovery in databases and expert systems.

Feature selection based on rough sets, also called attribute reduction, is a key concept in rough set theory. It aims to retain the ability of the original features to discern the objects in the universe. When predictive models are constructed, removing redundant features improves model interpretability and enhances generalization. With the emergence of large-scale, high-dimensional data sets, feature selection is highly significant for handling big data, which has high value but low value density.

In this paper, existing efficient attribute reduction algorithms are analyzed. By selecting useful features from an ordered feature sequence, a new attribute reduction algorithm based on PageRank is proposed. In addition, a class library (RSLibrary) for rough sets and data preprocessing is constructed, and a rough set data analysis system is designed on the basis of RSLibrary. The main work of this paper is as follows:

(1) Heuristic attribute reduction algorithms are analyzed and compared. The classical heuristic attribute reduction algorithm and accelerated reduction algorithms, in particular two accelerated attribute reduction algorithms, are analyzed and compared in detail.

(2) An attribute reduction algorithm based on "global" attribute importance is proposed. By combining rough set theory and PageRank, this paper proposes an attribute sorting algorithm (AttributeRank) and then designs an attribute reduction algorithm based on the resulting attribute ranking. With a parallel version running on distributed systems, the new algorithm can obtain an ordered feature set efficiently.

(3) A rough set data analysis platform is designed. A class library including attribute reduction algorithms and data preprocessing techniques is constructed, and a rough set data analysis platform is developed on the basis of RSLibrary.

An overview of the main content and directions for further research are given in the final part of the paper. The parallel version of attribute reduction provides a valuable approach to dealing with big data, offers new lessons for exploring efficient data mining techniques, and promotes development in the area of artificial intelligence.
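As an illustration of the classical heuristic attribute reduction mentioned in item (1), the following is a minimal sketch of the standard dependency-based forward greedy search, followed by a backward pass that drops redundant attributes. It is not the AttributeRank algorithm of this thesis, whose details the abstract does not give. The function names (`partition`, `dependency`, `greedy_reduct`) and the toy decision table are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, decisions, attrs):
    """Dependency degree |POS_B(D)| / |U|: the share of objects lying in
    equivalence classes whose members all carry the same decision."""
    if not rows:
        return 0.0
    positive = sum(
        len(block)
        for block in partition(rows, attrs)
        if len({decisions[i] for i in block}) == 1
    )
    return positive / len(rows)


def greedy_reduct(rows, decisions):
    """Forward greedy selection by dependency gain, then backward elimination.
    Illustrative sketch of the classical heuristic, not the thesis's method."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, decisions, all_attrs)
    reduct = []
    current = dependency(rows, decisions, reduct)
    while current < target:
        # Add the attribute that raises the dependency degree the most
        # (ties are broken in favour of the lowest attribute index).
        best = max(
            (a for a in all_attrs if a not in reduct),
            key=lambda a: dependency(rows, decisions, reduct + [a]),
        )
        reduct.append(best)
        current = dependency(rows, decisions, reduct)
    # Drop any attribute whose removal keeps the dependency degree unchanged.
    for a in list(reduct):
        rest = [b for b in reduct if b != a]
        if dependency(rows, decisions, rest) == target:
            reduct = rest
    return sorted(reduct)


# Hypothetical toy decision table: the decision equals (a0 AND a1); a2 is noise.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
decisions = [0, 0, 0, 1, 1, 0]
print(greedy_reduct(rows, decisions))
```

On this toy table the search keeps attributes 0 and 1 and discards the noise attribute 2. Each step recomputes the partition from scratch, which is the cost that accelerated reduction algorithms aim to reduce.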

Keywords/Search Tags:

Rough Sets, Feature Selection, AttributeRank, RSLibrary, Dissimilarity Coefficient of Attribute

Related items

1	Research On Feature Selection Based On F-neighborhood Rough Sets
2	Research On The Application Of Feature Selection Based On Rough Sets And Ant Colony Optimization Method
3	Research And Application On Feature Selection Based On Extending Of Rough Set
4	Study On Attribute Reduction Based On Rough Sets And Its Application
5	Efficient Feature Selection Algorithm Based On Rough Set
6	Researches Of Rough Set Model And Feature Selection For Numerical Data
7	Matroidal And Topological Approaches To Rough Sets
8	Research On Algorithms Of Feature Selection Based On Rough Set
9	Research On Method Of Attribute Weight Based On Rough Sets Theory
10	Research On The Gene Selection Based On Rough Sets Theory