Font Size: a A A

The Research Of The Classification Algorithm For Enterprise's Public Opinions Based On Huber-LMNN

Posted on:2017-10-01Degree:MasterType:Thesis
Country:ChinaCandidate:Q ZhuFull Text:PDF
GTID:2348330488459187Subject:Information Security and Electronic Commerce
Abstract/Summary:PDF Full Text Request
In the Internet environment, the public opinions about an enterprise will have a significant impact on the operations of the enterprise. So, the companies are very concerned about the development of the public opinions in the Internet. Because the public opinion data are large-scale and complex ones, the traditional classification algorithms often failed to achieve the desired classified results. To solve this problem, this paper studies the new classification technologies of enterprise's public opinions on the following two aspects:Firstly, the public opinion data are high-dimensional and nonlinear, hard to get theirs features. To solve the problem, the paper used a novel combinatorial kernel distance, instead of the Euclidean distance, to calculate the similarity of public opinion documents. Our method mapped the data into high dimensional feature space by the kernel function, which does not increase the computational complexity, and got the nonlinear features of public opinion data. The experimental results show that the proposed method has better classification results of public opinion events.Secondly, when the traditional LMNN algorithm is used to solve the large-scale text classification problems, there is a large-scale the semidefinite programming (SDP) problem in the LMNN. The SDP is hard to be solved effectively, because the size of the SDP problem in LMNN algorithm will expand rapidly as the data size. To solve this problem, this paper introduced the Huber loss function to divide the SDP model of LMNN algorithm into two smaller sub-optimization problems, which are much easy to be solved. The experimental results show that, compared with the traditional ones, the precision of the proposed algorithm improves 4.15%, and the classification time saves 47.10%. The proposed LMNN algorithm is more suitable for large-scale text classification than the traditional one.
Keywords/Search Tags:the classification of enterprise's public opinions, the extraction of text feature, LMNN algorithm, Huber loss function
PDF Full Text Request
Related items