Parallel Multi-label Evolutionary Hyper-network On Spark

Posted on:2018-10-03

Degree:Master

Type:Thesis

Country:China

Candidate:R Zhao

GTID:2348330569986444

Subject:Computer Science and Technology

Abstract/Summary:

In recent years, multi-label learning has attracted growing attention in fields such as image recognition and text classification, and it has increasingly important potential applications in the real world. In multi-label learning, each object is associated with a set of class labels, so the key challenge is the exponentially large space of possible label sets. Existing multi-label learning approaches mainly focus on improving the learning process by exploiting label correlations. Nevertheless, an intrinsic characteristic of multi-label data, namely class imbalance among labels, has not been well investigated. Besides, most multi-label learning algorithms do not cope well with large-scale data sets. In the multi-label evolutionary hyper-network, each hyper-edge and its corresponding weight represent high-order relationships between a feature subset and multiple class labels, which can be used effectively to mine label correlations. Based on the multi-label evolutionary hyper-network, this thesis proposes an improved algorithm that deals with label correlations and class imbalance and uses Spark's distributed parallel computing framework for large-scale data processing. The main research work of the thesis is as follows:

1. To deal with label correlations and class imbalance, this thesis proposes a modified multi-label evolutionary hyper-network based on Spark. Firstly, the model converts the traditional hyper-network into a multi-label hyper-network. Secondly, a cost-sensitive strategy is introduced into the multi-label evolutionary hyper-network to address the problem of class imbalance. Meanwhile, the replacement of hyper-edges and the gradient evolution learning process are optimized to reduce the time complexity and improve the performance. Finally, the adaptability of the algorithm to large-scale data sets is improved by implementing it in a parallel computing framework on the Spark platform.

2. To further improve the performance of the proposed algorithm on large-scale data sets, an improved multi-label evolutionary hyper-network ensemble algorithm based on Spark is proposed, which combines the hyper-network structure with ensemble learning. Firstly, the training set is partitioned into training clusters with similar feature spaces using a Self-Organizing Map. Secondly, for each training cluster, the proposed improved multi-label evolutionary hyper-network algorithm based on Spark is used to form a number of local multi-label hyper-networks. Finally, the local hyper-networks are combined into a new hyper-network using a selective ensemble learning method, and this hyper-network is used to predict the testing samples.

In this thesis, comprehensive experiments are conducted on 12 multi-label datasets to verify the effectiveness and superiority of the proposed algorithm. On the one hand, the effectiveness of the proposed algorithm is verified by comparing its prediction performance with that of state-of-the-art algorithms, such as Co-MLHN. On the other hand, an efficiency analysis shows that the proposed algorithm has lower time complexity, good parallelism and scalability.
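
The abstract does not give the model's formulas, so the following is only a minimal sketch of the hyper-edge idea it describes, under stated assumptions: features are discrete, a hyper-edge "matches" a sample when every feature value it stores equals the sample's value, each hyper-edge carries one weight per label, and a label is predicted when the summed weights of the matching hyper-edges exceed a threshold. The names `HyperEdge`, `matches`, `predict` and `positive_costs` are hypothetical, and the inverse-frequency cost is one common cost-sensitive choice, not necessarily the one used in the thesis.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HyperEdge:
    """Hypothetical hyper-edge: a feature subset plus one weight per label."""
    features: Dict[int, int]   # feature index -> expected feature value
    weights: List[float]       # weights[k] is this edge's vote for label k


def matches(edge: HyperEdge, x: List[int]) -> bool:
    """Assumed matching rule: every stored feature value equals the sample's."""
    return all(x[i] == v for i, v in edge.features.items())


def predict(edges: List[HyperEdge], x: List[int], n_labels: int,
            threshold: float = 0.0) -> List[int]:
    """Sum the label weights of all matching hyper-edges, then threshold."""
    scores = [0.0] * n_labels
    for edge in edges:
        if matches(edge, x):
            for k in range(n_labels):
                scores[k] += edge.weights[k]
    return [1 if s > threshold else 0 for s in scores]


def positive_costs(Y: List[List[int]]) -> List[float]:
    """Assumed cost-sensitive weighting: for each label, the cost of a
    positive example is the negative-to-positive ratio, so rare labels
    weigh more. Labels with no positives fall back to a cost of 1.0."""
    n = len(Y)
    costs = []
    for k in range(len(Y[0])):
        pos = sum(row[k] for row in Y)
        costs.append((n - pos) / pos if pos else 1.0)
    return costs


# Usage: three hyper-edges over three binary features and two labels.
edges = [
    HyperEdge({0: 1, 2: 0}, [0.8, -0.3]),
    HyperEdge({1: 1}, [-0.2, 0.9]),
    HyperEdge({0: 0}, [1.0, 1.0]),
]
labels = predict(edges, [1, 1, 0], n_labels=2)
```

In a Spark setting, the per-sample scoring above is the kind of step that could be mapped over partitions of the data, since each sample is scored independently of the others; how the thesis actually distributes training is not stated in the abstract.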

Keywords/Search Tags:

multi-label learning, evolution hyper-network, label correlations, Spark, ensemble

Related items

1	Research Of Calibrated Label Ranking Multi-label Algorithm Based On Spark
2	Text Categorization Of High Dimensional Imbalanced Data Based On Depth Label Correlation Mining
3	Research On Multi-label Classification Algorithm With Label Correlations
4	Research On Multi-label Learning And Algorithms Based On Data And Label Correlations
5	Research On Multi-Label Learning Based On Label-Specific Features And Label Correlations
6	Research On Multi-label Learning Algorithms Based On Samples And Label Correlations
7	Multi-Instance Multi-Label Learning Based On Neighborhood Consensus
8	Multi-label Learning Based On Transfer Learning And Label Correlation
9	Research On Multi-label Feature Selection Algorithm Based On Sparse Learning
10	Missing Multi-label Learning For Label Semantic Space Mining