Research On Key Techniques Of User-oriented Text Sentiment Analysis

Posted on: 2018-04-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X J Pu
Full Text: PDF
GTID: 1368330512998134
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology and social media, people have become used to sharing their opinions on the Web. The huge and growing amount of opinionated data provides a wealth of sentiment information. Analyzing and mining this information helps to understand public feelings, identify social demands, and predict event trends. Sentiment analysis is the emerging research area that analyzes this information automatically, and it has great social and economic value. The analysis of massive online reviews has drawn much attention from researchers. However, opinionated content is generated in various domains, takes a variety of forms, is dynamic, and carries personalized expression styles, all of which pose great challenges to the research. Existing work on sentiment analysis mainly focuses on general techniques, whereas studies of the domain-specificity and user personalization of opinions are still insufficient. In this thesis, from these two perspectives, we incorporate various factors, e.g. users, user expression habits, and the domain-specificity of opinion targets, throughout the whole analysis process to improve the performance of user-oriented sentiment analysis. The main contributions of the thesis are summarized as follows:

1. A novel method for topic modeling from keywords with compressive sensing, which improves the efficiency and accuracy of domain topic modeling for user needs. Users are usually interested in different domains, and sentiment expressions are also domain-specific, so a general approach is to design analysis methods for each domain separately. To satisfy the user's need for data for sentiment analysis in specific domains and to reduce the cost of describing domain content, we propose an effective approach for modeling a domain topic (Topic Distilling with Compressive Sensing). In our method, given user-provided domain-relevant keywords, the topic can be precisely reconstructed with compressive sensing, which exploits the sparsity of the topic in the semantic space. Moreover, an iterative learning method is used to make the result more robust. Our method yields a more accurate representation of the domain topic in the semantic space implied by the keywords. With this topic representation, domain texts can be filtered, which effectively supports the subsequent sentiment analysis tasks. Our method is simple yet effective, and the framework is compatible with other text semantic representation models, which makes it flexible and adaptable.

2. An improved smoothed CRF model for entity recognition, which improves the accuracy of opinion target extraction. Opinion target extraction is essential for accurately analyzing the sentiment of opinionated content. Opinion targets usually appear as entities and can be extracted with sequence labeling methods, of which CRF is commonly used. However, the generalization of CRF degrades when the labeled training data is insufficient, or when the training and testing data differ in distribution and domain. To improve the generalization of CRF, we introduce smoothing features, which improve the domain adaptation capability of CRF and increase the effectiveness of entity recognition, especially recall. Moreover, CRF is not good at capturing long-distance semantic relationships, which also makes some opinion targets difficult to recognize accurately. We fully exploit the context and syntactic position features of opinion targets, and then combine CRF with syntactic rules for opinion target recognition. Our method takes advantage of the high precision of CRF and remedies its low recall.

3. A novel method for document-level sentiment analysis that explores overall opinion sentences, which improves the performance of document sentiment classification. According to linguistic habits, when users express opinions towards a target, some opinion sentences address the target as a whole, while others address detailed aspects. The sentiments of these sentences may also differ. Existing methods for sentiment classification usually treat all opinion sentences equally, which makes it hard to predict correctly when the sentiments of most aspect opinion sentences are not consistent with the overall sentiment. By exploring the linguistic habits of opinion expression, we propose a novel method called SVMetop, extended from structural SVM, for sentiment classification. Our method recognizes overall opinions correctly and increases their impact in determining document sentiment, which improves the accuracy of document sentiment classification. The experimental results demonstrate the effectiveness and efficiency of our method.

4. A user-aware topic model for better modeling the fine-grained sentiment and topics of online reviews, which further enhances the performance of sentiment analysis. Users usually differ in their topics and sentiment preferences, as well as in the style in which they express opinions. To better analyze the sentiment and topics in reviews, all three factors, i.e. users, reviews, and items, should be carefully combined. We propose a novel topic modeling approach that exploits the impact of the three factors simultaneously and provides a unified modeling framework. With our model, more accurate and coherent topics can be extracted. Moreover, the user-topic and item-topic distributions are also generated, from which user interests and preferences can be obtained effectively, which is beneficial for personalized services.

In this thesis, several key techniques of sentiment analysis are systematically studied. Based on the proposed methods, we analyze and mine massive user behavior data on a cloud computing platform, including attribute extraction from web pages and user interest analysis. The processed results have great practical value for online advertising and recommendation systems. The experimental results demonstrate that the technical solutions proposed in the thesis are effective and efficient.
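
To make the idea behind the first contribution more concrete, the following is a minimal sketch of sparse recovery in the compressive sensing sense: a topic vector that is sparse over the semantic dimensions is reconstructed from a small number of keyword measurements. The abstract does not say which recovery algorithm the thesis uses, so orthogonal matching pursuit (OMP) is swapped in here purely as a common stand-in. The function names, the toy matrix `keyword_vectors`, and the weights `keyword_weights` are all hypothetical values chosen for illustration, not data or code from the thesis.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row per keyword, one column per semantic dimension


def dot(u: Vector, v: Vector) -> float:
    return sum(p * q for p, q in zip(u, v))


def solve(g: Matrix, b: Vector) -> Vector:
    """Solve the small square system g z = b by Gauss-Jordan elimination.

    Assumes g is non-singular (the selected columns are linearly independent).
    """
    n = len(b)
    aug = [list(g[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def omp(a: Matrix, y: Vector, sparsity: int, tol: float = 1e-9) -> Vector:
    """Recover a sparse x with a x ~= y by orthogonal matching pursuit."""
    n_dims = len(a[0])
    cols = [[row[j] for row in a] for j in range(n_dims)]
    norms = [dot(c, c) ** 0.5 for c in cols]
    support: List[int] = []
    coeffs: Vector = []
    residual = list(y)
    for _ in range(sparsity):
        if dot(residual, residual) ** 0.5 <= tol:
            break
        candidates = [j for j in range(n_dims)
                      if j not in support and norms[j] > 0]
        if not candidates:
            break
        # Greedy step: pick the unused dimension most correlated with the residual.
        best = max(candidates,
                   key=lambda j: abs(dot(cols[j], residual)) / norms[j])
        support.append(best)
        # Refit all selected dimensions by least squares (normal equations).
        gram = [[dot(cols[i], cols[j]) for j in support] for i in support]
        rhs = [dot(cols[i], y) for i in support]
        coeffs = solve(gram, rhs)
        approx = [sum(c * cols[j][k] for c, j in zip(coeffs, support))
                  for k in range(len(y))]
        residual = [yk - ak for yk, ak in zip(y, approx)]
    x = [0.0] * n_dims
    for c, j in zip(coeffs, support):
        x[j] = c
    return x


# Toy data (hypothetical): 4 keywords described over 6 semantic dimensions.
keyword_vectors: Matrix = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0, 1.0, -1.0, -1.0],
]
# Measurements generated by a topic that uses only dimensions 1 and 3.
keyword_weights: Vector = [3.0, -3.0, 1.0, -1.0]

topic = omp(keyword_vectors, keyword_weights, sparsity=2)
print(topic)  # [0.0, 2.0, 0.0, 1.0, 0.0, 0.0]
```

In this toy case four keyword measurements are enough to pin down a six-dimensional topic vector because only two of its entries are non-zero, which is the sparsity assumption the abstract refers to. A real system would need far larger semantic spaces and a numerically more robust solver.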
Keywords/Search Tags:sentiment analysis, opinion targets, sentiment classification, domain modeling, semantic representation, compressive sensing, information extraction, entity recognition, conditional random fields, structural learning, overall opinions, topic models