Study And Implementation On Feature Selection Algorithms In Large Data Sets

Posted on:2006-09-27

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Liu

Full Text:PDF

GTID:2168360155458071

Subject:Computer application technology

Abstract/Summary:

Data mining is a database technique that has emerged in recent years from advances in database systems and artificial intelligence. It operates on large volumes of ordinary business data, with the aim of extracting valuable knowledge or information from them. Data mining algorithms generally place certain requirements on their datasets, such as good completeness, little redundancy, and low correlation between attributes (features). However, data in real systems are often incomplete, redundant, and ambiguous, and seldom satisfy these requirements directly. Moreover, massive real-world data contain many insignificant components, which seriously reduce the efficiency of data mining algorithms, and noisy data lead to invalid induction. Data preprocessing has therefore become a key issue in implementing data mining systems.

Data preprocessing is an important and indispensable part of data mining. As an important step of data preprocessing, feature selection has become a very active research topic. It matters especially for large datasets that consist of many records and contain many features irrelevant to the data mining task at hand.

Rough set theory is a mathematical tool for characterizing imprecise, uncertain, and incomplete information. It can efficiently analyze and process information that is imprecise, inconsistent, or incomplete, and it can uncover hidden knowledge and potential rules. In recent years, rough set theory and its algorithms have become a hot research topic in data mining. Attribute reduction is one of the key problems, and reduction algorithms have therefore been studied extensively.

In this thesis, we briefly introduce the feature selection problem and the rough set model, and study feature selection algorithms based on the rough set model. The traditional rough set model is not integrated with relational database systems, and all of its computationally intensive operations are performed on flat files rather than taking advantage of the...
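
To make the idea of rough-set attribute reduction concrete, here is a minimal sketch in Python. It assumes the textbook definitions of the positive region and the dependency degree, and it uses a simple greedy forward search. It is not the algorithm developed in the thesis, whose details are not given in this abstract. The function names (`partition`, `dependency`, `greedy_reduct`) and the toy table `ROWS` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy decision table: "play" is the decision attribute.
ROWS = [
    {"outlook": "sunny", "windy": "no",  "humidity": "high", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "humidity": "high", "play": "no"},
    {"outlook": "rainy", "windy": "no",  "humidity": "low",  "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "humidity": "low",  "play": "no"},
]

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, cond_attrs, decision):
    """Dependency degree: share of rows lying in the positive region, i.e. in
    equivalence classes whose rows all carry the same decision value."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)

def greedy_reduct(rows, cond_attrs, decision):
    """Greedy forward selection (an assumed heuristic): keep adding the
    attribute that raises the dependency degree most until the subset
    matches the dependency degree of the full attribute set."""
    target = dependency(rows, cond_attrs, decision)
    chosen, remaining = [], list(cond_attrs)
    while remaining and dependency(rows, chosen, decision) < target:
        best = max(remaining,
                   key=lambda a: dependency(rows, chosen + [a], decision))
        chosen.append(best)
        remaining.remove(best)
    return chosen

if __name__ == "__main__":
    print(greedy_reduct(ROWS, ["outlook", "windy", "humidity"], "play"))
```

In this toy table the decision depends only on `windy`, so the greedy search stops after selecting that single attribute. A greedy search of this kind can return a subset that is not minimal on other data, which is one reason reduction algorithms remain an active research topic.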

Keywords/Search Tags:

data mining, feature selection, rough set, genetic algorithm

Related items

1	Research On Feature Selection Algorithm Based On Rough Set Theory
2	Research On Rough Set Theory Based Data Mining Algorithm
3	Model And Algorithm Of Analyzing Data Based On Rough Set Theory
4	Application And Research Of Large Database Mining Based On Rough Set And Genetic Algorithm
5	Research For The Application Of Feature Selection On TCM Data Mining
6	A novel approach to data mining: Genetic algorithm for feature selection
7	Research On The Gene Selection Based On Rough Sets Theory
8	Based On Rough Set Data Mining Method
9	Research On Feature Selection Algorithm Based On Rough Set Model Extension
10	The Research Of The Feature Selection And Cluster Algorithms In Data Mining