
Group Sparsity Based Subset Selection Applications

Posted on: 2020-09-27  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y Q Yao  Full Text: PDF
GTID: 1368330575966579  Subject: Computer application technology
Abstract/Summary:
Extracting useful information from large-scale data is a major challenge for artificial intelligence. As an effective means of information filtering and data summarization, subset selection chooses the most informative subset of a large-scale data set to represent the entire set, which reduces the scale of the data to be processed. Subset selection is also used to enhance model capacity in related fields and to improve generalization performance. This dissertation studies applications of group-sparsity-based subset selection in multiple kernel learning and multi-task learning. In multiple kernel learning, representative kernels are used to reduce the redundancy among different similarity measures and among information from different data sources, respectively. In multi-task learning, representative tasks are used to fully explore the underlying cluster structure of the tasks.

First, this dissertation proposes an efficient multiple kernel clustering method that enhances the diversity among the pre-specified kernels by selecting representative kernels. Specifically, we first design a strategy to select a representative subset from the pre-specified kernels, then incorporate this representative kernel selection strategy into the objective function of multiple kernel clustering, and finally design an alternating optimization method to optimize the cluster memberships and the kernel combination coefficients. In particular, a customized optimization method based on the alternating direction method of multipliers (ADMM) framework is developed to reduce the time complexity of optimizing the kernel weights. Experimental results on several benchmark and real-world data sets validate the effectiveness of the proposed method. Its superior performance over existing methods shows that the regularization induced by representative kernel selection can effectively improve the quality of the combined kernel.

Then, based on non-negative matrix factorization, this dissertation proposes a novel data fusion method that integrates representation information from different data sources and obtains high-quality data representations under the multiple kernel learning framework. Instead of directly combining the kernel matrices of the different data sources in a convex manner, we introduce a regularization term in the objective function that characterizes the similarity between pairwise kernel matrices and reduces the redundancy in the information from the different data sources. Interestingly, the resulting objective function can be viewed as a variant of representative kernel selection. An optimization method based on the alternating direction method of multipliers is then designed to solve the objective function. We evaluate the proposed method on the face recognition task, and experimental results on three data sets demonstrate the advantage of this diversified data fusion method.

Finally, based on the assumption that each task in multi-task learning can be represented by a linear combination of a few representative tasks, this dissertation proposes a robust task grouping method for clustered multi-task learning that selects representative tasks. Specifically, the proposed method uncovers the underlying cluster structure of the tasks by selecting representative tasks that share the most information with the other tasks. The related tasks are then divided into groups according to their representative tasks, so that information can be shared to some extent among the tasks within each group. In addition, a robust loss function is used to measure the error between each task and its reconstruction from the representative tasks, which effectively reduces the impact of outlier tasks. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of the proposed method in comparison with existing multi-task learning methods.
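The common ingredient of the methods above is group-sparsity based representative subset selection. A standard formulation of this idea (a minimal sketch, not the dissertation's exact objective) expresses every data column as a linear combination of a few columns via a self-representation matrix C, penalized by the row-wise l2,1 norm; rows of C with large norm mark the representatives. The function name, the ISTA-style solver, and all parameter values below are illustrative assumptions:

```python
import numpy as np

def select_representatives(X, lam=0.1, n_iter=500, lr=None):
    """Sketch of group-sparse subset selection:
        min_C  0.5 * ||X - X C||_F^2  +  lam * sum_i ||C[i, :]||_2
    Rows of C with large l2 norm indicate representative columns of X.
    Solved with proximal gradient descent (ISTA); illustrative only."""
    n = X.shape[1]
    C = np.zeros((n, n))
    G = X.T @ X                              # Gram matrix of the columns
    if lr is None:
        lr = 1.0 / np.linalg.norm(G, 2)      # step size from the Lipschitz constant
    for _ in range(n_iter):
        grad = G @ C - G                     # gradient of the smooth fitting term
        C = C - lr * grad
        # Row-wise group soft-thresholding: proximal step for the l2,1 norm.
        norms = np.linalg.norm(C, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
        C = scale * C
    row_norms = np.linalg.norm(C, axis=1)
    return np.argsort(-row_norms), C         # columns ranked by representativeness
```

In the kernel setting, X's columns (or the Gram matrix G) would be replaced by the pre-specified kernel matrices, and in the multi-task setting by the per-task parameter vectors; the group-sparse penalty plays the same role of selecting a few representatives in all three chapters.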
Keywords/Search Tags: Representative Subset Selection, Group Sparsity, Multiple Kernel Learning, Multi-Task Learning, Machine Learning