The Graph-based Semi-supervised Learning With Missing Data

Posted on:2012-12-17

Degree:Master

Type:Thesis

Country:China

Candidate:W Wang

Full Text:PDF

GTID:2178330335995085

Subject:Probability theory and mathematical statistics

Abstract/Summary:

Missing data is often encountered in data analysis and machine learning. The usual practice is to impute the data first (for example by mean imputation, KNN imputation, hot deck imputation, cold deck imputation, regression imputation, or multiple imputation) and then to fit a model on the completed data. However, imputation is time-consuming, and inappropriate imputation may cause large errors or false results that affect the subsequent analysis. In this thesis, we study methods for handling missing data in classification, with the aim of constructing a classification model without imputation. We first combine graph-based semi-supervised learning with missing data and construct a graph-based semi-supervised learning model that handles missing data automatically by constructing similarity weights on the incomplete data. We then implement the algorithm in R. Finally, we run experiments on UCI data sets (Letters, Spam, Diabetes, Wine, and Segment). The experimental conclusions are as follows:

1. We first fill in the missing data with classical statistical imputation (stochastic imputation, mean imputation, median imputation), apply graph-based semi-supervised learning to the imputed data, and compare the result with our method. Our method is slightly better than the classical methods.
2. Compared with a classical supervised learning model trained on data with no missing values, the proposed method, applied to data made incomplete by artificially removing some values, gives similar results. This indicates that the method is reasonable, and it is convenient because no imputation is needed when the data contain missing values.
3. Compared with traditional methods (impute the data first, then model on the completed data), our method performs better, and it has the comparative advantage of not needing to fill in the missing data.
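
The abstract does not say how the weights are built on incomplete data, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes a Gaussian kernel on a partial distance (computed over the features observed in both points and rescaled to the full dimension), followed by standard iterative label propagation with the labeled points clamped. The thesis implementation is in R; Python is used here purely for illustration, and all function names, the `sigma` parameter, and the toy data are hypothetical.

```python
import math

def partial_distance(x, y):
    """Squared Euclidean distance over features observed in both x and y,
    rescaled to the full dimension. None marks a missing value (assumption)."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return None  # no co-observed feature, so no evidence of similarity
    d2 = sum((a - b) ** 2 for a, b in shared)
    return d2 * len(x) / len(shared)

def similarity_matrix(X, sigma=1.0):
    """Gaussian-kernel weights built directly on incomplete rows, no imputation."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = partial_distance(X[i], X[j])
            if d2 is not None:
                W[i][j] = W[j][i] = math.exp(-d2 / (2 * sigma ** 2))
    return W

def propagate_labels(W, labels, n_classes, n_iter=200):
    """Iterative label propagation; labels[i] is a class index or None."""
    n = len(W)
    F = [[0.0] * n_classes for _ in range(n)]
    for i, y in enumerate(labels):
        if y is not None:
            F[i][y] = 1.0
    for _ in range(n_iter):
        new_F = []
        for i in range(n):
            total = sum(W[i])
            if labels[i] is not None or total == 0:
                new_F.append(F[i])  # clamp labeled points; skip isolated ones
                continue
            new_F.append([
                sum(W[i][j] * F[j][c] for j in range(n)) / total
                for c in range(n_classes)
            ])
        F = new_F
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Toy data: two well-separated clusters, each row missing at most one feature.
X = [
    [0.0, 0.1, None],
    [0.2, None, 0.0],
    [0.1, 0.0, 0.1],
    [5.0, 5.1, None],
    [None, 4.9, 5.0],
    [5.1, 5.0, 4.9],
]
labels = [0, None, None, 1, None, None]  # one labeled point per cluster
pred = propagate_labels(similarity_matrix(X), labels, n_classes=2)
print(pred)
```

Rescaling by the share of co-observed features keeps distances between sparsely and fully observed pairs on a comparable scale. That is one possible design choice; the thesis may weight incomplete pairs differently.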

Keywords/Search Tags:

missing data, graph-based semi-supervised learning, imputation, similar weights

Related items

1	Studies On Missing Data Imputation
2	Research On Adaptive And Robust Missing Value Imputation Algorithm
3	Research On The Application Of Geometric Information In The Semi-supervised Learning
4	Comparative Study On Imputation Methods Of Missing Data In XGBOOST Model Under Complete Random Missing Mechanism
5	Learning with Low-Quality Data: Multi-View Semi-Supervised Learning with Missing Views
6	Nonparametric Imputation For Missing Data
7	Attribute Correlation Modeling And Missing Value Imputation Of Incomplete Data Based On Fuzzy Partition
8	Attribute Associated Neuron Modeling And Missing Value Imputation Based On Neural Network
9	Missing Value Imputation Based On TS Modeling With Alternate Learning
10	The Online Imputation Method Of Missing Value Based On KNN And Its Application In Credit Evaluation