Study Of Completing Missing Data

Posted on:2012-01-19

Degree:Master

Type:Thesis

Country:China

Candidate:C M Jin

Full Text:PDF

GTID:2178330332496988

Subject:Computer application technology

Abstract/Summary:

Missing data is a common problem that pervades much of modern research. It makes analysis more difficult, produces unreliable results, and reduces the efficiency of the whole statistical procedure. In particular, when fully observed and partially observed cases differ systematically, results obtained by applying conventional statistical methods to the incomplete data set are not representative of the whole. Traditional techniques for replacing missing data may have serious limitations, and recent developments in computing allow more sophisticated techniques to be used.

Data mining (knowledge discovery in databases) is the process of extracting useful, credible, valid, and comprehensible patterns from large-scale data in an intelligent and automatic way. Data reinforcement is one of the most important directions in data mining. This thesis studies the imputation of missing data:

1. It describes the research background, the current state of research, and the classification of missing-data mechanisms, and explains the basic concepts of missing data imputation.
2. It compares the efficacy of four current and promising methods for dealing with missing data. Efficacy is judged by the percent bias in the estimated parameters.
3. Its focus is a new relationship matrix model. The new relationship matrix records every similarity or difference found when comparing the condition attributes and the decision attributes between objects. On this basis, the method mines the potential links between objects and completes the missing data. The completed data do not undermine the system's consistency.
4. Two groups of experiments validate the algorithm. Experiment One compares the recovery rate of the mean method, the conditional mean method, and the proposed algorithm on three data sets from the UCI Machine Learning Repository. Experiment Two examines completion accuracy under different deletion rates, using seven levels of incompleteness.
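
The two baselines named in Experiment One, and the recovery rate used to score them, can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it does not cover the relationship matrix algorithm, whose details the abstract does not give. It assumes discrete attributes, so the "mean" is taken as the most frequent observed value; it assumes the conditional variant restricts that count to objects with the same decision value; and it assumes the recovery rate is the fraction of deleted cells restored to their original value. The function names and the toy table are hypothetical.

```python
from collections import Counter


def most_frequent(values):
    """Return the most common value in a non-empty list."""
    return Counter(values).most_common(1)[0][0]


def impute_mode(rows, decision_index=None):
    """Fill None cells with the column's most frequent observed value.

    If decision_index is given, only rows with the same decision value
    are counted (the assumed 'conditional mean' variant); if no such row
    has an observed value, the whole column is used instead.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            column = [r[j] for r in rows if r[j] is not None]
            pool = column
            if decision_index is not None:
                same_class = [r[j] for r in rows
                              if r[j] is not None
                              and r[decision_index] == row[decision_index]]
                pool = same_class or column
            if pool:  # leave the cell empty if the column has no data
                filled[i][j] = most_frequent(pool)
    return filled


def recovery_rate(original, incomplete, completed):
    """Fraction of deleted cells whose completed value equals the original."""
    deleted = [(i, j) for i, r in enumerate(incomplete)
               for j, v in enumerate(r) if v is None]
    hits = sum(completed[i][j] == original[i][j] for i, j in deleted)
    return hits / len(deleted)


# Hypothetical toy decision table: two condition attributes, then the decision.
original = [
    ["sunny", "high", "no"],
    ["sunny", "high", "no"],
    ["sunny", "high", "no"],
    ["rain", "normal", "yes"],
    ["rain", "normal", "yes"],
    ["rain", "normal", "yes"],
    ["rain", "high", "yes"],
]
incomplete = [list(r) for r in original]
incomplete[1][0] = None  # delete two cells
incomplete[4][1] = None

plain = impute_mode(incomplete)
conditional = impute_mode(incomplete, decision_index=2)
print(recovery_rate(original, incomplete, plain))        # 0.0
print(recovery_rate(original, incomplete, conditional))  # 1.0
```

The table is contrived so that the overall most frequent value is wrong for both deleted cells while the class-conditional one is right; on real data the gap between the two baselines will generally be smaller.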

Keywords/Search Tags:

data reinforcement, new relationship matrix, incomplete information table, rough set, collision avoidance

Related items

1	Rough Set Theory Of Incomplete Information System & Its Action In Data Process
2	Research On Matrix Algorithm For Attribute Reduction In Incomplete Decision Table
3	Research On Algorithm For Attribute Reduction Table Based On Enriching Discernibility Matrix In Incomplete Information Systems
4	Research On Attribute Reduction Algorithm In Incomplete Decision Table Based On Rough Set Theory
5	Rough Set Approach To Data Mining In Incomplete Information Systems
6	Rough Set Theory And Method Research In Incomplete Data Analysis
7	Researched For Processing Approach Of Incomplete Information System Based On Rough Set Theory
8	The Application Research Of Rough Set In Data Mining Of Incomplete Information System
9	Research On An Approach Of Incomplete Information Processing Based On The Rough Set Theory
10	Incomplete Information System Based On Rough Set Attribute Reduction Method Of Research