Research On Transductive Transfer Learning With Self-training

Posted on: 2014-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: K B Hu
Full Text: PDF
GTID: 2268330401988929
Subject: Computer application technology
Abstract/Summary:
In real-world applications, labels are difficult to obtain because massive data, such as Web reviews and online transactions, emerge so quickly. Existing data mining algorithms therefore face great challenges, and transfer learning has attracted widespread attention. Transfer learning acquires knowledge from existing tasks and reuses it to learn a new task, and it is not limited by the independent and identically distributed (i.i.d.) assumption. We focus on transductive transfer learning based on self-training for the sentiment classification of reviews. The main contributions are as follows:
(1) We give a general overview of transfer learning, including why it emerged, its development background, its main problems and its categories, and we summarize the related work and applications.
(2) We propose the MDACD algorithm, which aims to make better use of the source domains while addressing the negative impact of relatively "poor" source domains. MDACD handles the source domains dynamically, which helps it adapt to the target domain. It also uses the class distribution to select source domains, partly eliminating the negative impact of "poor" source domains. Extensive experiments show that it is more effective than the baseline algorithms.
(3) To address the negative impact on transfer learning of relatively "poor" instances in the source domains, we propose the MAIR algorithm. It reconstructs each instance in the target domain from multiple instances in the source domains. In this way it makes full use of the source-domain instances related to the target domain and avoids the negative effects of relatively "poor" instances. Extensive experiments demonstrate that MAIR is superior to several existing approaches in classification accuracy and time overhead.
(4) To make our algorithms easier to apply, we design a prototype system for cross-domain sentiment classification of reviews. The system integrates online data access with our algorithms and can achieve good results in real-world applications.
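The abstract does not spell out the self-training procedure, so the following is only a minimal sketch of generic self-training in a transductive transfer setting, not the MDACD or MAIR algorithm. It assumes a nearest-centroid base classifier and a distance-margin confidence score. The names `centroids`, `predict` and `self_train`, and the parameters `rounds` and `per_round`, are hypothetical and chosen purely for illustration.

```python
from math import dist


def centroids(X, y):
    """Return the mean feature vector of each class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict(cents, x):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    ranked = sorted((dist(x, c), label) for label, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(Xs, ys, Xt, rounds=5, per_round=2):
    """Label target instances Xt using labeled source data (Xs, ys).

    Assumption: each round, the most confidently predicted unlabeled
    target instances are pseudo-labeled and added to the training pool.
    """
    X, y = list(Xs), list(ys)
    pending = list(range(len(Xt)))  # indices of still-unlabeled target instances
    for _ in range(rounds):
        if not pending:
            break
        cents = centroids(X, y)
        # Rank pending target instances by confidence, highest margin first.
        scored = sorted(((predict(cents, Xt[i]), i) for i in pending),
                        key=lambda t: -t[0][1])
        for (label, _), i in scored[:per_round]:
            X.append(Xt[i])
            y.append(label)
            pending.remove(i)
    # Final transductive predictions for every target instance.
    cents = centroids(X, y)
    return [predict(cents, x)[0] for x in Xt]
```

In this sketch the source domain supplies the only true labels, and the classifier drifts toward the target domain as pseudo-labeled target instances join the pool. A real sentiment classifier would replace the toy feature vectors with text features and would likely use a stronger base learner.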
Keywords/Search Tags:data mining, transfer learning, sentiment classification, self-training