Analyzing The Emotion Of The Readers Based On Multi-label News Corpus

Posted on: 2017-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: X B Peng
GTID: 2308330485469646
Subject: Software engineering
Abstract/Summary:
With the rapid development of Web 2.0, tools such as blogs, microblogs and WeChat have become everyday platforms for publishing and sharing information. On the Internet, anyone can share opinions, which carry a great deal of personal thought, subjective emotion and emotional feedback. Sentiment analysis based on massive amounts of Internet text is an important part of public opinion analysis, and sociological studies need to track the dynamic trend of the public mood expressed in that text.

This paper studies sentiment analysis from the standpoint of readers. In other words, by analyzing the words or topics in a news text, it tries to predict the emotions readers feel after reading the news. Existing work on public emotion, however, puts more emphasis on the position of writers and pays little attention to readers. Moreover, related work usually treats text sentiment analysis as a single-label problem, assuming that one text arouses only one emotion in its readers, which does not match reality. In addition, most related research is based on the bag-of-words model; yet according to social psychology, readers' emotions are associated not only with the literal words of the news text but also, indirectly, with the event theme implied in the report. To address these problems, this paper carries out a systematic analysis of text sentiment based on a multi-label corpus tagged by the public. The main work is as follows:

(1) Construction of a multi-label news corpus.
In line with the research focus on "multi-label" and "reader", this paper treats sentiment analysis as a multi-label classification problem. It crawls Sina social news texts that carry social annotations, together with the readers' poll data, and then preprocesses the data.

(2) Experiments on the multi-label news data using the bag-of-words model and the topic model, with analysis of the results. The data are processed from different angles and modeled as binary, multi-class and multi-label classification problems, yielding models that predict readers' emotions. The experiments show that the topic model performs relatively better in classification than the bag-of-words model, and that its feature vectors have far fewer dimensions. Research on the topic model therefore has important practical significance.

(3) Application of the hybrid-tag M-LDA method to text sentiment classification, with the emotion labels treated as known classes. Traditional LDA is an unsupervised topic model, which usually has to be combined with a separate classifier for label classification. To handle the multi-label news corpus and make full use of the class labels in the data set, this paper draws on M-LDA, a supervised topic model that mixes in known classes. M-LDA mixes the known classes with the latent topics at the topic level, introduces the known class information during modeling, and finally outputs the labels sorted by weight.
Experiments show that the proposed M-LDA model performs well on both single-label multi-class and multi-label classification problems. For multi-label reader sentiment classification in particular, its accuracy is greatly improved over traditional methods.
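The three problem framings mentioned in the abstract (binary, multi-class and multi-label) can be illustrated with a minimal Python sketch that derives each kind of target from one reader poll record. This is not the thesis's actual procedure: the emotion names, the positive/negative grouping, the 0.2 vote-share threshold and all function names are assumptions made only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical emotion names and polarity grouping, chosen only for
# illustration; the abstract does not list the actual poll options.
POSITIVE_EMOTIONS = {"moved", "amused"}


def multi_label_target(votes: Dict[str, int], min_share: float = 0.2) -> List[str]:
    """Multi-label view: keep every emotion whose share of the votes
    reaches min_share (an assumed threshold), ranked by share."""
    total = sum(votes.values())
    if total == 0:
        return []
    shares = {emotion: count / total for emotion, count in votes.items()}
    kept = [emotion for emotion, share in shares.items() if share >= min_share]
    # Highest share first; ties are broken alphabetically.
    return sorted(kept, key=lambda emotion: (-shares[emotion], emotion))


def multi_class_target(votes: Dict[str, int]) -> Optional[str]:
    """Single-label multi-class view: only the most-voted emotion."""
    if not votes or sum(votes.values()) == 0:
        return None
    return min(votes, key=lambda emotion: (-votes[emotion], emotion))


def binary_target(votes: Dict[str, int]) -> Optional[str]:
    """Binary view (assumed here to mean polarity of the top emotion)."""
    top = multi_class_target(votes)
    if top is None:
        return None
    return "positive" if top in POSITIVE_EMOTIONS else "negative"


# One hypothetical poll record for a single news article.
example_votes = {"moved": 120, "angry": 95, "sad": 60, "amused": 5}
```

For `example_votes`, the multi-label target keeps three emotions (`moved`, `angry`, `sad`), while the single-label view keeps only `moved`. This is the information loss the abstract attributes to the single-label assumption.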
Keywords/Search Tags: social multi-label, topic model, LDA, sentiment analysis