
Realization On Multi-label Text Classification Algorithm Based On SSPP-KELM

Posted on: 2018-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: B Shi
GTID: 2348330518482356
Subject: Computer technology
Abstract/Summary:
Text classification can be divided by the number of class labels per document into single-label and multi-label classification: if each document has exactly one label, the task is single-label text classification; otherwise it is multi-label text classification. Multi-label text classification is common in practical applications. Current research focuses on dimensionality reduction and classification algorithms for multi-label text data. However, some existing dimensionality reduction algorithms for multi-label text do little to improve classification performance, and their time efficiency is relatively low. Multi-label text classification algorithms also have significant problems: for example, the associations among labels are not considered during classification, and classification accuracy is not ideal. Having analyzed these problems, and drawing on existing dimensionality reduction and classification algorithms for multi-label text, this thesis studies the following two aspects.

First, in the data preprocessing stage, we propose Supervised Sparsity Preserving Projections (SSPP), a method based on Sparsity Preserving Projections (SPP) that incorporates label information, to reduce the dimensionality of the data by mapping high-dimensional data to a low-dimensional space. Dimensionality reduction is an important step in multi-label text learning. We use a kernel function within the SSPP method to derive the feature space of the data, so that the dimensionality of nonlinear samples can also be reduced. Because multi-label text data carries a large amount of label information, we incorporate this information into the dimensionality reduction method, so that the originally unsupervised method becomes a supervised one. This solves the problem that the dimensionality reduction algorithm cannot use the supervisory information in the sample data.

Second, we introduce the Extreme Learning Machine (ELM) to classify the data after dimensionality reduction. Incorporating a kernel function into the ELM both preserves time efficiency and improves classification accuracy. Furthermore, the kernel ELM can classify multi-label text data sets directly by mapping the class labels to the output nodes of the ELM, so the completeness of each sample's label set is preserved during classification. Finally, this thesis proposes combining SSPP with the kernel ELM to classify multi-label text data sets.
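The classification step in the abstract can be illustrated with a minimal sketch of a kernel ELM for multi-label data. This is not the thesis's implementation. It assumes the commonly used closed-form kernel ELM solution, beta = (K + I/C)^-1 T, a Gaussian (RBF) kernel, and a zero threshold on each output node. The class name `KernelELM`, the parameters `C` and `gamma`, and the toy data are hypothetical. The SSPP step is omitted, so the inputs are assumed to be already reduced in dimension.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


class KernelELM:
    """Illustrative kernel ELM with one output node per label (assumed design)."""

    def __init__(self, C=10.0, gamma=1.0):
        self.C = C          # regularization strength (assumed hyperparameter)
        self.gamma = gamma  # RBF kernel width (assumed hyperparameter)

    def fit(self, X, Y):
        # Y is an (n_samples, n_labels) 0/1 indicator matrix; map it to -1/+1 targets.
        self.X_train = np.asarray(X, dtype=float)
        T = 2.0 * np.asarray(Y, dtype=float) - 1.0
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: beta = (K + I/C)^-1 T.
        self.beta = np.linalg.solve(K + np.eye(n) / self.C, T)
        return self

    def decision_function(self, X):
        # One real-valued score per label (output node) for each sample.
        K_test = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return K_test @ self.beta

    def predict(self, X):
        # A label is assigned when its output node score is positive.
        return (self.decision_function(X) > 0.0).astype(int)


if __name__ == "__main__":
    # Toy data: four samples, two labels; the last sample carries both labels.
    X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
    Y = np.array([[1, 0], [1, 0], [0, 1], [1, 1]])
    model = KernelELM(C=100.0, gamma=1.0).fit(X, Y)
    print(model.predict(X))
```

Because every label has its own column in the target matrix, a sample's whole label set is predicted in one pass, without decomposing the task into separate single-label problems.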
Keywords/Search Tags:multi-label, text classification, dimensionality reduction, ELM, SPP