
Research Of Sample Selection

Posted on: 2018-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Lu
Full Text: PDF
GTID: 2348330536478209
Subject: Engineering
Abstract/Summary:
This paper is based on two years of research on machine learning and classifiers. In practice, the cost of labeling data and doubts about data reliability mean that ideal data is rarely available: real-world data is often noisy, and sometimes it is not labeled at all. The wish to extract the untapped value of such data without manual intervention gave rise to semi-supervised learning and transfer learning.

Semi-supervised learning methods are often adopted to handle datasets with a very small number of labeled samples. However, conventional semi-supervised ensemble learning approaches have two limitations: 1) most of them cannot obtain satisfactory results on high-dimensional datasets with limited labels, and 2) they usually do not consider how to use an optimization process to enlarge the training set. In this paper, we propose the progressive semi-supervised ensemble learning approach (PSEMISEL) to address these limitations and handle datasets with a very small number of labeled samples. Compared with traditional semi-supervised ensemble learning approaches, PSEMISEL is characterized by two properties: 1) it adopts the random subspace technique to investigate the structure of the dataset in the subspaces, and 2) it uses a progressive training set generation process and a self-evolutionary sample selection process to enlarge the training set. We also use a set of nonparametric tests to compare different semi-supervised ensemble learning methods over multiple datasets. The experimental results on 18 real-world datasets from the University of California ...

This paper also introduces sample selection for transfer learning. It specifies how to use the sample selection method to improve the accuracy of predictions transferred from the source domain to the target domain. Comparative experiments confirm the effectiveness of this method, and we describe the concrete sample selection process and the improvement contributed by the optimization process. We also analyze the experiments to identify a suitable range for sample selection.
Keywords/Search Tags:ensemble learning, machine learning, optimization, random subspace, semi-supervised learning, transfer learning, sample selection
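
The two ideas named in the abstract, random subspaces and progressive enlargement of the training set through sample selection, can be illustrated with a small sketch. This is not PSEMISEL itself: the abstract does not give its details, so the sketch swaps in a generic self-training loop with nearest-centroid ensemble members. All function names and parameters (`progressive_self_training`, `per_round`, `subspace_dim`, and so on) are hypothetical, and `subspace_dim` is assumed to be no larger than the number of features.

```python
import numpy as np


def fit_centroids(X, y, classes):
    """Return one mean vector per class (a nearest-centroid classifier)."""
    return np.stack([X[y == c].mean(axis=0) for c in classes])


def ensemble_votes(X_lab, y_lab, X_query, classes, n_members, subspace_dim, rng):
    """Vote fractions per class from members trained on random feature subsets."""
    votes = np.zeros((X_query.shape[0], len(classes)))
    for _ in range(n_members):
        # Random subspace: each member sees only a random subset of features.
        feats = rng.choice(X_lab.shape[1], size=subspace_dim, replace=False)
        centroids = fit_centroids(X_lab[:, feats], y_lab, classes)
        dists = np.linalg.norm(
            X_query[:, feats][:, None, :] - centroids[None, :, :], axis=2
        )
        votes[np.arange(X_query.shape[0]), dists.argmin(axis=1)] += 1
    return votes / n_members


def progressive_self_training(X_lab, y_lab, X_unlab, n_rounds=5, per_round=10,
                              n_members=15, subspace_dim=2, seed=0):
    """Grow the labeled set by pseudo-labeling the most confidently voted samples."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y_lab)
    X_lab, y_lab, X_unlab = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(n_rounds):
        if X_unlab.shape[0] == 0:
            break
        votes = ensemble_votes(X_lab, y_lab, X_unlab, classes,
                               n_members, subspace_dim, rng)
        confidence = votes.max(axis=1)
        # Sample selection (assumed rule): keep the samples with the highest
        # ensemble agreement and add them with their majority-vote label.
        chosen = np.argsort(-confidence)[:per_round]
        X_lab = np.vstack([X_lab, X_unlab[chosen]])
        y_lab = np.concatenate([y_lab, classes[votes[chosen].argmax(axis=1)]])
        X_unlab = np.delete(X_unlab, chosen, axis=0)
    return X_lab, y_lab
```

Calling `progressive_self_training(X_lab, y_lab, X_unlab)` returns the enlarged training set, with the original labeled samples first and the pseudo-labeled ones appended round by round. Selecting by vote agreement is only one possible selection rule; the self-evolutionary selection process mentioned in the abstract is presumably more elaborate.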