A Study On Classification And Feature Selection Based On Transfer Learning And Its Application

Posted on: 2016-04-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T G Ni
GTID: 1228330464465521
Subject: Light Industry Information Technology and Engineering
Abstract/Summary:
Feature selection and pattern classification, two important tasks in pattern recognition, have recently attracted increasing attention from researchers. The development of transfer learning has broadened the range of traditional pattern recognition techniques, and many research results have been applied in data mining, image processing, speech recognition, fingerprint classification, and medical diagnosis. However, feature selection and pattern classification methods based on transfer learning still suffer, to a certain extent, from low robustness and weak generalization ability. To address these problems, this dissertation makes the following contributions:

1. A new Relief feature weighting algorithm based on transfer learning is proposed. First, a new margin-based objective function that integrates transfer learning information is formulated. Then, by applying optimization theory, several useful theoretical results are derived from this objective function. Finally, a set of transfer Relief feature weighting algorithms is developed for two-class and multi-class data. Extensive experiments on artificial and UCI datasets show that the proposed algorithms perform competitively with state-of-the-art algorithms.

2. A classification method for evolutionary data streams is proposed. By effectively exploiting a similarity criterion between the data distributions in adjacent evolutionary windows, together with the related knowledge of counterexamples, an enhanced objective function for the optimization problem is proposed, and its solution is derived. The method considers both the maximal margin criterion within each evolutionary window and the global optimization of the whole evolutionary data stream, and it makes full use of the counterexamples. The new method learns the decision hyperplanes successfully. Experiments on artificial and real datasets demonstrate its effectiveness.

3. Learning from group probabilities helps to protect user privacy and has become a hot topic in the machine learning community. Traditional learning methods based on group probabilities have achieved some success; however, they fall short when the prior information is not fully provided. To solve this problem, a novel transfer learning method called the transfer group probabilities based learning machine (TGPLM) is proposed by integrating group probabilities into the principle of structural risk minimization. In TGPLM, a novel learning criterion is proposed that reuses related domain knowledge by minimizing a domain similarity distance, so that TGPLM not only makes full use of the group probabilities in the current scene but also effectively learns the useful knowledge available from the historical scene. Experimental results on artificial, UCI, and PIE face datasets show the effectiveness of the proposed method.

4. To address the man-made scarcity of information in machine learning, a novel transfer learning machine based on group probabilities and oriented toward common data, called TGPLM-CD, is proposed. The method is based on the structural risk minimization model. In the learning process it considers the knowledge of the source domain, the class-label group probabilities, and the data common to the source and target domains, thereby realizing knowledge transfer between the source domain and the target domain. Experimental results on extensive datasets show the effectiveness of the proposed method.

5. In the real world, the exact labels of data are often unknown. To address the corresponding learning problem, a novel transfer support vector machine for learning from data with uncertain labels (TSVM-UL) is proposed. The method is based on the structural risk minimization model. It learns by considering the knowledge of the source domain and the data common to the different domains, as well as the labeled samples and the label probabilities of the unlabeled samples in the target domain, thereby realizing knowledge transfer between the source domain and the target domain. Experimental results on the PIE and 20 Newsgroups datasets show the effectiveness of the proposed method.

6. In areas such as politics, improper content checking, and disease diagnosis, only a few labeled data and the label proportions of unlabeled observations may be known, so that exact label information is protected; this results in a man-made scarcity of information. To address the corresponding learning problem, a support vector machine with manifold regularization and partially labeling privacy protection, called SVM-MR&PLPP, is proposed. In SVM-MR&PLPP, a novel learning criterion is proposed by integrating the label proportions of unlabeled data into the manifold regularization framework, which improves classification accuracy. By transforming the optimization function, the complexity of the algorithm is reduced, and a scalable variant, called SSVM-MR&PLPP, is proposed, which is efficient for large datasets. Experimental results on extensive datasets show the effectiveness of the proposed method.
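
For readers unfamiliar with Relief, the feature weighting that the first contribution builds on can be sketched as follows. This is a minimal sketch of the classic two-class Relief rule only, not the dissertation's transfer variant: the abstract does not state the transfer-aware objective function, so no transfer term is reproduced here. The function name `relief_weights`, the range-based scaling, the choice to visit every sample once instead of sampling at random, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Classic two-class Relief feature weights (no transfer term).

    Every sample is visited once instead of being drawn at random,
    which keeps this illustration deterministic.
    """
    n, d = len(X), len(X[0])

    # Scale each feature difference by the feature's range so that all
    # features contribute on a comparable [0, 1] scale.
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # a nearest hit and a nearest miss are both required
        near_hit = min(hits, key=lambda k: distance(X[i], X[k]))
        near_miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(d):
            # Reward features that differ across classes (nearest miss) and
            # penalize features that differ within a class (nearest hit).
            weights[j] += (diff(X[i], X[near_miss], j)
                           - diff(X[i], X[near_hit], j)) / n
    return weights


# Toy data, made up for illustration: feature 0 tracks the label,
# feature 1 is unrelated to it.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.5], [0.9, 0.4], [1.0, 0.8], [0.8, 0.6]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
print(weights)
```

On this toy data the informative feature 0 receives a positive weight and the uninformative feature 1 a negative one. A transfer variant along the lines described in the abstract would presumably add source-domain information to the margin-based objective, but its exact form is not given here.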
Keywords/Search Tags: Evolutionary data streams, Support vector machine, Transfer learning, Group probabilities, Privacy protection