Research On Multi-label Dimensionality Reduction And Classification Algorithms

Posted on: 2015-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2298330467985592
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet and information technology, a huge amount of multi-label data has appeared online. In multi-label data, each instance belongs to several labels at the same time. How to process such data effectively and efficiently has been an active research topic in recent years.

In multi-label learning, most attention has been paid to classification algorithms. This thesis also considers that high-dimensional multi-label data is difficult to learn from and may run into the "curse of dimensionality". It therefore proposes a multi-label dimensionality reduction algorithm, Multi-label Kernel Discriminant Analysis (MLKDA), to reduce the dimensionality of multi-label data. In the classification step, the Extreme Learning Machine (ELM) is adapted to multi-label classification using the algorithm adaptation approach, and the resulting multi-label ELM classifies multi-label data efficiently.

Dimensionality reduction is a preprocessing step in multi-label learning. Data points in a high-dimensional space are often not linearly separable, and some multi-label dimensionality reduction methods cannot perform nonlinear dimensionality reduction. In addition, some methods do not treat the multi-label structure as a whole, which may destroy the integrity of the multi-label data. To address these issues, MLKDA maps the data features with the kernel trick to handle linear inseparability, and it treats the label set as a whole, taking the associations among labels into account, to preserve label integrity. MLKDA aims to reduce dimensionality while preserving as much discriminant information as possible, which mitigates the curse of dimensionality and facilitates the subsequent classification.

Multi-label classification is the goal of multi-label learning. Existing multi-label classification methods can be divided into problem transformation methods and algorithm adaptation methods. Efficiency is an important concern, and most problem transformation methods suffer from low efficiency and poor scalability. To achieve relatively fast and accurate classification, this thesis takes the algorithm adaptation approach and adapts ELM to multi-label classification.

Moreover, with scalability in mind, MLKDA and the multi-label ELM are combined and extended to multi-label data streams. The improved algorithm addresses the small-sample problem that may arise when dimensionality reduction is performed on the partitioned blocks of a data stream, and it classifies multi-label data streams quickly.

Together, MLKDA and the multi-label ELM learn well from high-dimensional multi-label data. Experimental results on commonly used multi-label datasets show that MLKDA outperforms existing dimensionality reduction algorithms in most cases and that adapting ELM to multi-label classification is a good choice. The experiments on multi-label data streams illustrate the scalability of the proposed algorithm.
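
The idea of adapting ELM to multi-label classification can be illustrated with a minimal sketch. It is not the thesis's implementation: the sigmoid activation, the ridge term `reg`, the -1/+1 target recoding, the zero decision threshold, and the names `MultiLabelELM`, `X_demo`, and `Y_demo` are all assumptions chosen for illustration. The sketch shows the general ELM recipe (random, untrained hidden weights plus a closed-form least-squares solve for the output weights) with one output node per label, and it omits the MLKDA step.

```python
import numpy as np


class MultiLabelELM:
    """Illustrative single-hidden-layer ELM with one output node per label."""

    def __init__(self, n_hidden=50, reg=1e-2, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed value)
        self.reg = reg            # ridge regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the fixed random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, Y):
        # Y is an (n_samples, n_labels) 0/1 label indicator matrix.
        n_features = X.shape[1]
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = 2.0 * Y - 1.0  # recode targets to -1/+1 (assumption)
        # Output weights from ridge-regularized least squares.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def decision_function(self, X):
        return self._hidden(X) @ self.beta

    def predict(self, X):
        # Assumed decision rule: assign a label when its score is positive.
        return (self.decision_function(X) > 0).astype(int)


# Toy usage on synthetic data with two labels per instance.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
Y_demo = np.column_stack(
    [X_demo[:, 0] > 0, X_demo[:, 1] + X_demo[:, 2] > 0]
).astype(int)
model = MultiLabelELM(n_hidden=50).fit(X_demo, Y_demo)
pred = model.predict(X_demo)
print("training Hamming accuracy:", (pred == Y_demo).mean())
```

Because only the output weights are solved for, training reduces to one linear solve, which is the efficiency argument usually made for ELM. A kernel-based reduction step such as MLKDA would be applied to `X_demo` before `fit` in the pipeline the abstract describes.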
Keywords/Search Tags: Multi-label, Dimensionality Reduction, Kernel Function, Multi-label Classification