
Research On Imbalanced Classification Problems In The Framework Of Transfer Learning

Posted on: 2018-04-23  Degree: Master  Type: Thesis
Country: China  Candidate: D M Bao
GTID: 2348330518977686  Subject: Computer Science and Technology
Abstract/Summary:
As a new machine learning framework, transfer learning relaxes the two basic assumptions of traditional machine learning and has received increasing attention in recent years. Existing work on imbalanced classification under the transfer learning framework focuses mainly on single-source transfer. The potential problem is that a single source carries limited information and may even produce "negative transfer". To overcome this shortcoming, this thesis introduces a multi-source transfer mechanism and studies multi-source transfer learning for imbalanced classification.

First, to solve the binary classification transfer learning problem in which the source and target domains have similar data distributions and the positive and negative samples are imbalanced, we present an integrated transfer learning algorithm for multi-source imbalanced sample classification, referred to as MSTUSC. The algorithm tries to avoid negative transfer by drawing on multiple source domains, and it uses new strategies for initializing and updating sample weights to address class imbalance. We also propose a new elimination mechanism that removes redundant samples from the source domains, which significantly reduces the time and memory costs of the classifier. Experimental results on standard UCI datasets show that the proposed algorithm outperforms state-of-the-art transfer learning algorithms in terms of the F1-measure and AUC evaluation metrics.

Second, to improve the time efficiency of MSTUSC, the thesis proposes a distributed integrated transfer learning algorithm for multi-source imbalanced sample classification, DMSTUSC. A distributed system is introduced, and each source domain is assigned to one node of the system. On each node, the integrated transfer learning algorithm for single-source imbalanced sample classification is trained to obtain a classification model. Finally, the classification models trained on the individual nodes are integrated to obtain the multi-source classifier. The experimental results show that the time cost of DMSTUSC is significantly lower than that of MSTUSC.
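The overall flow described above (class-aware initial weights, one model trained per source domain, then integration of the per-source models) can be illustrated with a minimal Python sketch. This is not the thesis's MSTUSC or DMSTUSC: the abstract does not give the weight initialization, weight update, or elimination rules, so the sketch assumes a simple class-balanced weighting, a weighted decision stump on one-dimensional features, and a majority vote. All function names are hypothetical, and the thread pool only stands in for the nodes of a distributed system.

```python
from concurrent.futures import ThreadPoolExecutor

def balanced_weights(labels):
    """Assumed initialization: each class receives half of the total weight."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return [0.5 / n_pos if y == 1 else 0.5 / n_neg for y in labels]

def train_stump(xs, ys, ws):
    """Weighted decision stump on 1-D features; returns (threshold, polarity)."""
    best = None
    for t in sorted(set(xs)):
        for pol in (1, -1):
            # Weighted error of the rule "predict 1 if pol * (x - t) >= 0".
            err = sum(w for x, y, w in zip(xs, ys, ws)
                      if (1 if pol * (x - t) >= 0 else 0) != y)
            if best is None or err < best[0]:
                best = (err, t, pol)
    return best[1], best[2]

def predict_stump(model, x):
    t, pol = model
    return 1 if pol * (x - t) >= 0 else 0

def train_on_source(source, target):
    """Train one model on a single source domain plus the labeled target data."""
    xs = source[0] + target[0]
    ys = source[1] + target[1]
    return train_stump(xs, ys, balanced_weights(ys))

def train_all(sources, target):
    # One worker per source domain (a stand-in for one cluster node each).
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        return list(pool.map(lambda s: train_on_source(s, target), sources))

def predict_ensemble(models, x):
    """Integrate the per-source models by majority vote (ties go to class 1)."""
    votes = sum(predict_stump(m, x) for m in models)
    return 1 if 2 * votes >= len(models) else 0

# Toy data: (features, labels) per domain, with few positive samples.
sources = [
    ([0.1, 0.2, 0.3, 0.4, 0.9], [0, 0, 0, 0, 1]),
    ([0.0, 0.15, 0.35, 0.5, 0.8, 0.85], [0, 0, 0, 0, 1, 1]),
    ([0.05, 0.25, 0.45, 0.6, 0.88], [0, 0, 0, 0, 1]),
]
target = ([0.2, 0.3, 0.95], [0, 0, 1])
models = train_all(sources, target)
```

Calling `predict_ensemble(models, x)` then classifies a new target sample. Because each per-source model is trained independently, the training step parallelizes naturally, which is the property the distributed variant relies on.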
Keywords/Search Tags: transfer learning, class imbalance, multi-source, integrated classifier, distributed system