The Research Of Data Classification Method Based On Multi-Task Learning

Posted on: 2019-11-30  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Ma  Full Text: PDF
GTID: 2428330545952594  Subject: Computer Science and Technology
Abstract/Summary:
In the context of big data applications, data analysis and mining face great challenges as the channels through which information is generated and the ways in which it is expressed increase rapidly. In machine learning, data classification problems are usually handled by building a separate classification model for each data set, that is, single-task learning. Because single-task learning cannot take advantage of the information shared across tasks, its classification accuracy is limited. Mining the correlation among multiple training tasks to improve the generalization ability of multi-task learning models has therefore become a hot topic. However, existing classification methods still have serious defects in extracting the relationships among multiple tasks: they treat tasks in isolation and ignore the correlation between them, and they cannot be combined effectively with other classification techniques, so classification accuracy reaches a bottleneck. Building on current data classification and multi-task learning techniques, this paper aims to improve data classification accuracy, studies these problems using relevant methods from machine learning, and makes several contributions. The main research work of this paper is as follows:

Firstly, to address the problem that traditional multi-task learning models do not adequately extract the correlation between tasks and focus on a single extraction level, this paper proposes a multi-task learning model with sparse induction. Building on conventional approaches and on the principle that predictions from different feature combinations should be consistent, we construct a task-level multi-task learning model using the L2,1 norm, which induces sparsity for specific tasks and shares features among multiple related tasks.

Secondly, the objective function of the MTMVC+ learning model proposed in this paper contains three non-smooth regularization terms; the objective is convex but non-smooth, so traditional algorithms for convex programming cannot be applied to it directly. To this end, this paper proposes and implements an alternating iterative optimization algorithm, which fixes some of the variables in the objective function and solves for the others, repeating until the objective function converges. In addition, the convergence of the algorithm is proved in detail, and its effectiveness is demonstrated by experiments on the 20Newsgroup dataset.

Finally, this paper compares the sparse-induction MTMVC+ model with classic multi-task learning models on the WebKB, NUS-WIDE Object, and Multi-feature digit benchmark datasets. The experimental results show that, by extracting the correlation between multiple tasks, the MTMVC+ classification model effectively improves data classification accuracy, with an improvement of between 7% and 9%.
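
As an illustration of the L2,1 norm mentioned in the abstract, the following is a minimal sketch, not the thesis's MTMVC+ implementation. It assumes a weight matrix `W` whose rows are features and whose columns are tasks; the function names `l21_norm` and `prox_l21` are hypothetical. Row-wise soft-thresholding is the standard proximal operator of the L2,1 norm and is one common way to handle such a non-smooth term; the abstract does not say whether the thesis uses it.

```python
import math


def l21_norm(W):
    """L2,1 norm: sum over rows of the Euclidean norm of each row.

    Assumption: rows of W are features, columns are tasks.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def prox_l21(W, tau):
    """Proximal operator of tau * ||W||_{2,1} (row-wise soft-thresholding).

    Each row is shrunk toward zero by tau in Euclidean norm; rows whose
    norm is at most tau become exactly zero, which removes that feature
    for all tasks at once.
    """
    out = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * w for w in row])
    return out


if __name__ == "__main__":
    # Hypothetical weights: 3 features (rows) shared by 2 tasks (columns).
    W = [[3.0, 4.0], [0.1, 0.1], [0.0, 0.0]]
    print("L2,1 norm:", l21_norm(W))
    print("after shrinkage:", prox_l21(W, 1.0))
```

In this toy example the first row (a feature with large weights in both tasks) is only shrunk, while the second row (small weights in both tasks) is set to zero for every task. This is the sense in which an L2,1 penalty selects features jointly across related tasks.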
Keywords/Search Tags:Task relationship, Multi-task learning, Alternately iterative optimization, Regularization