
Research On Technologies Of Microblog Sentiment Classification With Topic Self-adaptation And Opinion Summarization

Posted on: 2016-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: F Hu
Full Text: PDF
GTID: 2428330545486562
Subject: Computer application technology
Abstract/Summary:
As microblog platforms continue to grow, large numbers of topics are generated every day, reflecting popular real-world events and the focus of public attention. The microblogs posted under a specific topic concentrate users' views on that topic. To capture public opinion, this thesis takes the microblog topic as its research object and studies two aspects: sentiment classification and opinion summarization.

Microblog topics are varied, but sentiment-annotated training sets are scarce, and a classifier trained on one topic adapts poorly to others. This thesis therefore proposes a framework for microblog sentiment classification with topic self-adaptation, which uses annotated data from a source topic to classify unannotated data from a target topic. First, LDA is used to model the topics and measure the similarity between their topic distributions, which determines the source topic to apply to the target topic. Next, the SFA algorithm and a latent feature space are introduced to reduce the mismatch between the features of different topics. In addition, based on the characteristics of microblog data, non-textual features are added to improve the match between features across topics, yielding the final feature combination. Finally, the classifier trained on the source topic is used to classify the sentiment of the target topic.

Microblogs on the same topic emphasize different aspects, and so form different microblog clusters. By analyzing how strongly opinion words are related within a cluster and between clusters, this thesis proposes an opinion summarization technique based on overlapping community detection. First, opinion words are extracted and the PMI values between them are calculated, from which a network of opinion words is built. Then SLPA is used to find overlapping communities of opinion words, after which the microblogs are classified to form microblog clusters. Finally, the Hybrid TF-IDF algorithm is used to extract typical opinions from each microblog cluster.

Experiments on real datasets for both parts show that the proposed method can effectively classify sentiment on the target topic, and that the extracted typical opinions reflect the different focuses of a topic.
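The source-topic selection step can be illustrated with a minimal sketch. It assumes each microblog topic has already been reduced to an LDA topic distribution (a probability vector), and it uses Jensen-Shannon divergence as the similarity measure. The abstract does not name the measure the thesis uses, so that choice, the function names, and the example vectors are all hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, 0 for identical distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def pick_source_topic(target, sources):
    """Return the name of the annotated source topic closest to the target."""
    return min(sources, key=lambda name: js_divergence(target, sources[name]))

# Hypothetical LDA topic distributions (assumed, not taken from the thesis).
sources = {
    "movie_release": [0.70, 0.20, 0.10],
    "food_safety": [0.10, 0.15, 0.75],
}
target = [0.60, 0.30, 0.10]
best = pick_source_topic(target, sources)
print(best)
```

The classifier trained on the selected source topic would then be applied to the target topic; the feature alignment step (SFA) is not shown here.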
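The opinion-word network can be sketched in a similar way. The example below computes PMI from microblog-level co-occurrence, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), and keeps a pair as a weighted edge when its PMI exceeds a threshold. The co-occurrence window, the threshold `min_pmi`, and the sample words are assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_network(docs, min_pmi=0.0):
    """Build {(word_a, word_b): pmi} from lists of opinion words per microblog.

    Probabilities are estimated from how many microblogs contain a word
    or a pair of words. Only pairs with pmi > min_pmi become edges.
    """
    n = len(docs)
    word_count = Counter()
    pair_count = Counter()
    for words in docs:
        unique = sorted(set(words))  # sorted so each pair has one canonical key
        word_count.update(unique)
        pair_count.update(combinations(unique, 2))
    edges = {}
    for (a, b), c_ab in pair_count.items():
        pmi = math.log2(c_ab * n / (word_count[a] * word_count[b]))
        if pmi > min_pmi:
            edges[(a, b)] = pmi
    return edges

# Hypothetical opinion words extracted from four microblogs.
docs = [
    ["expensive", "slow"],
    ["expensive", "slow"],
    ["beautiful", "fast"],
    ["beautiful", "expensive"],
]
edges = pmi_network(docs)
for pair, value in sorted(edges.items()):
    print(pair, round(value, 3))
```

A community detection algorithm such as SLPA would then run on this weighted graph; that step is outside the scope of this sketch.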
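For the final extraction step, the following is a rough sketch of one common formulation of Hybrid TF-IDF: term frequency is counted over the whole cluster, inverse document frequency treats each microblog as a document, and each microblog's summed word weight is divided by max(minimum length, word count) so that very short posts are not favored. Whether the thesis uses exactly this variant is not stated in the abstract; the `min_len` value and the sample cluster are hypothetical.

```python
import math
from collections import Counter

def hybrid_tfidf_scores(posts, min_len=3):
    """Score each tokenized post in a cluster; a higher score is more typical."""
    n_posts = len(posts)
    total_words = sum(len(p) for p in posts)
    tf = Counter(w for p in posts for w in p)       # counts over the whole cluster
    df = Counter(w for p in posts for w in set(p))  # number of posts containing w
    weight = {
        w: (tf[w] / total_words) * math.log2(n_posts / df[w])
        for w in tf
    }
    # Normalize by post length, but never by less than min_len.
    return [sum(weight[w] for w in p) / max(min_len, len(p)) for p in posts]

# Hypothetical microblog cluster, already tokenized.
cluster = [
    ["battery", "drains", "fast"],
    ["battery", "drains", "quickly"],
    ["nice", "screen"],
]
scores = hybrid_tfidf_scores(cluster)
top = max(range(len(cluster)), key=lambda i: scores[i])
print(cluster[top], round(scores[top], 4))
```

In this toy cluster the length floor matters: with `min_len=1` the two-word post would outscore the three-word posts, which is the bias the normalization is meant to damp.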
Keywords/Search Tags: sentiment classification, opinion summarization, LDA, topic distribution, overlapping communities