Research On Feature Selection Based On F-neighborhood Rough Sets

Posted on:2021-04-22

Degree:Master

Type:Thesis

Country:China

Candidate:Z X Deng

Full Text:PDF

GTID:2428330611490820

Subject:Computer Science and Technology

Abstract/Summary:

In the era of big data, more and more features are collected in production practice. Some attributes may be redundant or unrelated to the classification task, and they need to be removed before further data processing. Feature selection (also known as attribute reduction) is a technique that reduces the feature set in order to find the best subset of features for predicting the sample category. For both single-label and multi-label data, the key problem in the feature selection process is feature evaluation. For multi-label data, existing work does not adequately consider the relationships among labels, which degrades the performance of multi-label feature selection and of multi-label learning. To address these problems, this thesis combines the advantages of neighborhood rough sets and F-rough sets and proposes a new rough set model, called the F-neighborhood rough set. F-neighborhood rough sets are then applied to single-label and multi-label feature selection. The main research contents are as follows.

First, F-neighborhood rough sets are proposed by combining the advantages of neighborhood rough sets and F-rough sets. The neighborhood relation in F-neighborhood rough sets is defined, neighborhood decision subsystems are used to represent different situations, and their properties are discussed. Further, F-attribute dependency and attribute significance matrices are constructed for feature evaluation, and a feature selection algorithm based on these two evaluation criteria is designed. Experimental results show that the algorithm has clear advantages over state-of-the-art algorithms.

Second, the F-neighborhood rough set model is extended from single-label learning to multi-label learning. F-neighborhood rough sets decompose multi-label data into a family of single-label decision tables. The attribute dependency of this family of single-label decision tables is then constructed for information fusion, so that the relationships among labels are fully considered. Multi-label feature selection is performed using the attribute dependency of the multiple decision tables and an attribute significance matrix. Experimental results show that the algorithm has clear advantages over state-of-the-art algorithms in both text and image multi-label learning tasks.

The main innovations of this thesis are as follows:
(1) The F-neighborhood rough set model is proposed. This model has the advantages of both neighborhood rough sets and F-rough sets.
(2) A feature selection algorithm (NPRMS) based on the attribute significance matrix is proposed. NPRMS is suitable for both discrete and continuous data, and for both static and dynamic data. NPRMS has good robustness.
(3) For multi-label learning, a feature selection algorithm (FNPRMS) based on attribute significance matrices is proposed. FNPRMS inherits the advantages of NPRMS and fully considers the relationships among labels. It does not need to perform spatial conversion and has good comprehensibility.
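
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the thesis's NPRMS or FNPRMS algorithms. It assumes the classical neighborhood rough set dependency (the fraction of samples whose delta-neighborhood contains a single class), treats each label column of a multi-label table as its own single-label decision table, and fuses the per-label dependencies with a plain average. The function names, the Euclidean distance, the `delta` radius, the averaging rule, and the greedy forward search are all assumptions made for illustration; the attribute significance matrices described above are not modeled.

```python
import numpy as np


def neighborhoods(X, delta):
    """Boolean matrix N with N[i, j] True when sample j lies within
    Euclidean distance `delta` of sample i."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist <= delta


def dependency(X, y, features, delta):
    """Classical neighborhood dependency: the share of samples whose
    neighborhood (measured on `features` only) holds a single class."""
    N = neighborhoods(X[:, features], delta)
    same_class = y[:, None] == y[None, :]
    # A sample is in the positive region if every neighbor shares its class.
    consistent = np.all(~N | same_class, axis=1)
    return float(consistent.mean())


def family_dependency(X, Y, features, delta):
    """ASSUMPTION: treat each label column of Y as one single-label
    decision table and fuse the dependencies by a plain average."""
    per_label = [dependency(X, Y[:, k], features, delta)
                 for k in range(Y.shape[1])]
    return float(np.mean(per_label))


def greedy_select(X, Y, delta):
    """Hypothetical forward search: keep adding the feature that raises
    the fused dependency most; stop when no feature improves it."""
    selected, remaining, best = [], list(range(X.shape[1])), 0.0
    while remaining:
        scored = [(family_dependency(X, Y, selected + [f], delta), f)
                  for f in remaining]
        # Ties are broken in favor of the lower feature index.
        score, f = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best


# Toy data: feature 0 determines label 0, feature 1 determines label 1,
# and feature 2 is constant (uninformative).
X = np.array([[0.0, 0.0, 0.5],
              [0.1, 1.0, 0.5],
              [0.0, 0.1, 0.5],
              [1.0, 1.0, 0.5],
              [0.9, 0.0, 0.5],
              [1.0, 0.9, 0.5]])
Y = np.array([[0, 0], [0, 1], [0, 0], [1, 1], [1, 0], [1, 1]])

selected, score = greedy_select(X, Y, delta=0.2)
print(selected, score)  # [0, 1] 1.0
```

On this toy table, each informative feature alone explains only one of the two labels (fused dependency 0.5), while the pair explains both, so the search keeps features 0 and 1 and drops the constant feature.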

Keywords/Search Tags:

Feature selection, neighborhood rough sets, F-rough sets, multi-label learning, attribute significance matrix

Related items

1	A Study On Attribute Reduction Based On Neighborhood And Fuzzy Rough Sets
2	Research And Application On Feature Selection Based On Extending Of Rough Set
3	Feature Selection Algorithm For Multi-label Learning
4	Research On Multi-label Text Classification Methods Based On Rough Sets
5	Researches Of Rough Set Model And Feature Selection For Numerical Data
6	The Model Of ?-? Neighborhood Rough Sets And Its Applications
7	Research On Feature Selection Algorithm Based On Rough Sets
8	Mixed Data Mining Methods Based On Rough Sets Theory
9	Feature Selection Of Information Systems Based On Neighborhood Tolerance Rough Sets
10	Research On Mixed Data Knowledge Acquisition Method Based On Neighborhood Multi-granularity Rough Sets