Research Of Algorithms And Applications On Transductive Transfer Learning

Posted on: 2013-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: L T Yu
Full Text: PDF
GTID: 2248330371997433
Subject: Systems Engineering
Abstract/Summary:
KDD (Knowledge Discovery in Databases) is a technique for discovering latent patterns in large-scale data. In the information technology industry and in management, enterprises are eager to convert data and information into useful, applicable knowledge. As the most important tool of knowledge discovery, machine learning has been used extensively in customer relationship management, marketing prediction, credit card risk analysis, and similar areas. An important prerequisite of traditional machine learning is that the data system to be predicted and the existing data systems share the same feature space and follow the same data distribution. When the data are integrated, a good prediction model can easily be obtained with classical learning methods. Unfortunately, the data available for knowledge discovery are often dynamic and fragmented in practice, and these constraints have a strong negative influence on data mining applications. When the data distribution of the system to be predicted changes, manually labeling the new instances and rebuilding the prediction model costs both time and labor. As a branch of artificial intelligence, transfer learning uses transferable latent knowledge discovered from several similar data systems to improve prediction performance, and it has attracted attention in fields such as information retrieval and natural language processing.

In this thesis we first review the basic concepts and development of transfer learning and introduce the latest research results. We then focus on transductive transfer learning and study algorithms and applications for cross-domain classification. The contents include:

(1) We propose a transductive algorithm, RDA (Rule based Domain Adaptation), which uses feature rules to reconstruct the domain representation in order to solve the cross-domain classification problem. The algorithm first separates pivot features from non-pivot features via mutual information, then extracts the rules between these features by observing the target domain, and rebuilds the data representation of the source domain. Finally, a classifier is built on the new representation of the source domain. RDA avoids large-scale matrix computation and keeps the transfer process consistent with human cognition, while still ensuring good transfer classification performance.

(2) For the case of multiple source domains, we propose the CMS algorithm (Collaborative Transfer Classification using Multiple Source Domains). This method uses a sequential optimization strategy: it first reweights the samples from each source domain to reduce the distribution disparities, then builds a regression model from the outputs of the base classifiers trained on all source domains, exploiting their collaborative effect. The CMS model simplifies the parameter search step of the multi-task learning process and can reach the global optimum, so over-fitting in the target domain is effectively reduced.

(3) For fragmented data, we apply the SFA (Spectral Feature Alignment) algorithm to customer churn prediction, which improves the predictive capability of the model.

We conducted experiments with the proposed algorithms on several data sets and demonstrated their effectiveness and advantages. For the customer churn prediction problem, we also verified the applicability of transductive transfer classification on business data sets.
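The first step of RDA, separating pivot from non-pivot features via mutual information, can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say which variable the mutual information is computed against or how the cut-off is chosen, so the sketch assumes discrete features scored against class labels and a fixed top-k split. The function names, the toy features, and the parameter `k` are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def split_pivot_features(rows, labels, feature_names, k):
    """Rank features by mutual information with the labels.

    Assumption: the k highest-scoring features are treated as pivots and
    the rest as non-pivots; the thesis may use a different criterion.
    """
    scores = {
        name: mutual_information([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], ranked[k:], scores


# Toy data: each row is a binary bag-of-words vector for one labeled document.
feature_names = ["good", "battery", "plot"]
rows = [(1, 1, 0), (1, 0, 1), (0, 1, 0), (0, 0, 1)]
labels = [1, 1, 0, 0]

pivots, non_pivots, scores = split_pivot_features(rows, labels, feature_names, k=1)
print(pivots, non_pivots)
```

In this toy data the feature "good" matches the label exactly, so it scores highest and is selected as the single pivot, while "battery" and "plot" are independent of the label and remain non-pivots.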
Keywords/Search Tags: Transductive classification, Rules, Collaborative classification using multiple source domains, Customer churn