Research On Subjective And Objective Classification Of Short Text Based On Ensemble Learning

Posted on: 2017-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Tao
Full Text: PDF
GTID: 2348330488998058
Subject: Computer Science and Technology
Abstract/Summary:
With the continued development of information technology, people need new techniques for acquiring useful knowledge from large volumes of text in a short time. Information extraction technologies such as search engines, automatic summarization, opinion mining, and opinion sentence extraction have therefore developed rapidly. Subjective and objective classification, a basic problem in text classification, has attracted growing attention from researchers. Traditional approaches to subjective and objective classification use machine learning methods for training and classification, but because of the inherent complexity of Chinese text, their performance has reached a bottleneck. We therefore propose to improve performance further with ensemble learning, which can significantly enhance the generalization ability of a classifier and can also improve the accuracy and stability of classification.

This paper addresses subjective and objective classification using ensemble learning. We first introduce the traditional subjective and objective classification methods and the theoretical background of ensemble learning, and then construct ensemble classifiers according to the basic features of subjective and objective texts. The main work of this paper is summarized as follows:

(1) We first collect subjective clue features, then introduce the concept of subjective clue density and describe how it is calculated. On this basis, the texts are divided into different areas, and a Naive Bayes classifier is used to perform the classification. Finally, we propose an algorithm that uses the Bagging method to integrate the classifiers. Experimental results show that the classification method based on subjective clues is effective and performs better in an ensemble setting. We also find that it adapts well to new test samples.

(2) We combine a variety of features, such as the words of the text, part of speech, and semantic dependency. We then rank each kind of feature by its computed CHI value and determine the optimal feature dimension. In the fusion experiments, we try a variety of feature combinations and ultimately determine the best one. Because of the complexity and variety of the classification task, we introduce clustering ensemble learning and propose a classification method based on clustering dynamic integration, which strengthens the selection of base classifiers within specific regions of the text set. Experimental data show that this method outperforms traditional methods, especially in accuracy.
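The CHI-based feature ranking mentioned in item (2) can be illustrated with a minimal sketch. It assumes the standard chi-square statistic over a 2x2 feature/class contingency table. The abstract does not give the exact formula, tokenization, or data used in the thesis, so the function names `chi_square` and `rank_features`, the toy documents, and the cut-off of two features are all hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for one (feature, class) pair.

    a: documents in the class that contain the feature
    b: documents outside the class that contain the feature
    c: documents in the class that lack the feature
    d: documents outside the class that lack the feature
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence either way.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def rank_features(docs, labels, target="subjective"):
    """Return (feature, CHI value) pairs, highest CHI first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    n_target = sum(1 for y in labels if y == target)
    n_other = len(labels) - n_target
    in_target, in_other = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count document frequency, not term frequency.
        for tok in set(tokens):
            if y == target:
                in_target[tok] += 1
            else:
                in_other[tok] += 1
    scores = {}
    for tok in set(in_target) | set(in_other):
        a, b = in_target[tok], in_other[tok]
        scores[tok] = chi_square(a, b, n_target - a, n_other - b)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy, invented data: two subjective and two objective token lists.
docs = [
    ["i", "think", "movie", "great"],
    ["i", "feel", "plot", "boring"],
    ["movie", "released", "2016"],
    ["plot", "follows", "detective"],
]
labels = ["subjective", "subjective", "objective", "objective"]

ranked = rank_features(docs, labels)
top_k = [tok for tok, _ in ranked[:2]]  # keep the k highest-scoring features
```

In this toy set, a token that appears only in subjective documents receives a high score, while a token spread evenly across both classes scores zero. Choosing the optimal feature dimension, as the abstract describes, would then amount to trying several values of k and comparing classifier performance on held-out data.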
Keywords: subjective and objective classification, ensemble learning, dynamic ensemble, Bagging