Research Of Preprocessing In Rough Set Based On Similar Prediction

Posted on:2012-09-04

Degree:Master

Type:Thesis

Country:China

Candidate:Z Jiang

Full Text:PDF

GTID:2218330368981942

Subject:Computer software and theory

Abstract/Summary:

As data mining technology matures, information industries are emerging in large numbers and the Internet is developing rapidly in daily life. The amount of information people require is growing exponentially. In practice, traditional data analysis and data query methods cannot meet these needs, because they do not uncover the potential knowledge hidden in the data. As a mathematical tool, rough set theory does not require any additional information or prior knowledge beyond the data itself. Because of this feature, rough set theory has gradually become one of the most important theories in knowledge discovery in databases (KDD). Classical rough set theory cannot handle missing values in the source data, so the data must be preprocessed before a data mining algorithm is applied, and how to carry out this preprocessing effectively is an important problem.

This thesis takes the direct padding method and the non-treatment method as its approaches to data preprocessing in rough sets. First, it reviews the features and disadvantages of the main existing padding algorithms, such as redundancy in the information system, the need for a prior probability distribution, and the lack of any handling of sparsity. Collaborative filtering, which is based on similarity computation, is used to handle sparse information tables and is combined with the Direction-area padding algorithm, giving an improved null-value estimation method for rough sets based on similar prediction. Second, entropy and mutual information are introduced as a dual-feature weight that describes the properties of the information table, so that the padded values reflect those properties. Last, for multi-valued and non-existent null values, a multi-value incomplete information system and a limited tolerance relation based on existing null values are used to handle these cases in attribute reduction.

A simulation verifies that the improved algorithm is effective: it handles sparse data well, and when the information table is sparse its accuracy and mean absolute error are better than those of the original method. Worked instances also verify that the treatment of multi-valued and non-existent null values is feasible in attribute reduction.
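
The abstract does not spell out the estimation algorithm, so the following is only a minimal sketch of the general idea of filling null values by similarity, in the spirit of collaborative filtering. The similarity measure (fraction of agreeing attributes among those known in both rows), the neighbour count `k`, and the names `similarity` and `fill_nulls` are all assumptions made for illustration. It does not reproduce the Direction-area padding algorithm or the entropy and mutual-information weights.

```python
from collections import defaultdict

MISSING = None  # assumed marker for a null value in the information table


def similarity(a, b):
    """Fraction of agreeing attributes among those known in both rows."""
    shared = [(x, y) for x, y in zip(a, b)
              if x is not MISSING and y is not MISSING]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)


def fill_nulls(table, k=2):
    """Return a copy of `table` with nulls filled by a similarity-weighted vote.

    For each null cell, the k most similar rows that know the attribute
    vote for their own value, each vote weighted by its similarity.
    Similarities are computed on the original table, not on filled values.
    A cell stays null if no neighbour has positive similarity.
    """
    filled = [list(row) for row in table]
    for i, row in enumerate(table):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Rows other than i that have a known value for attribute j.
            neighbours = sorted(
                ((similarity(row, other), other[j])
                 for m, other in enumerate(table)
                 if m != i and other[j] is not MISSING),
                key=lambda pair: pair[0],
                reverse=True,
            )[:k]
            votes = defaultdict(float)
            for sim, candidate in neighbours:
                votes[candidate] += sim
            if votes and max(votes.values()) > 0:
                filled[i][j] = max(votes, key=votes.get)
    return filled


# Hypothetical toy information table: rows are objects, columns are attributes.
table = [
    ["sunny", "hot", "no"],
    ["sunny", "hot", None],
    ["rainy", "cool", "yes"],
    ["rainy", None, "yes"],
]
filled = fill_nulls(table, k=2)
print(filled)
```

Weighting votes by similarity rather than filling with the most frequent value overall means a sparse row borrows values from the rows it most resembles, which is the intuition behind using collaborative filtering on sparse information tables.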

Keywords/Search Tags:

rough set, data preprocessing, null value, similarity, multi-value

Related items

1	Study On Data Preprocessing Based On Rough Set And Its Application
2	The Research On Data Preprocessing Based On Rough Sets Theory
3	VPRS Based Approaches For Discretization Of Continuous Attributes And Data Preprocessing
4	The Research Of Rough Set On Data Preprocessing
5	The Research On Data Preprocessing In Data Mining Based On Rough Sets Theory
6	Research And Implementation Of Intelligent Data Preprocessing System
7	Multi-granulation Rough Sets And Granular Reductions Based On Similarity Measure
8	Analysis Of Financial Data Based On Rough Set Theory
9	Filled The Default Data Based On Rough Set Theory And The Variable Domain Of Rule-based Reasoning
10	Commerce Data Mining Based On Rough Set Theory