Multi-source Public Sentiment Monitoring System For Enterprise: Research And Implementation

Posted on:2015-01-13

Degree:Master

Type:Thesis

Country:China

Candidate:Y Fang

Full Text:PDF

GTID:2308330464458066

Subject:Computer software and theory

Abstract/Summary:

Today the Internet is the most important channel through which people retrieve information and express their opinions. With the rapid growth of social and general-purpose information platforms such as Weibo and WeChat, sources of public sentiment have become more complex, more social, more immediate, and faster to propagate. Enterprises have a strong need to monitor online news, blogs, tweets, forum comments, and video sites comprehensively in order to maintain their public image and respond to emergencies promptly. A public sentiment mining system built on multi-source web page collection, intelligent analysis, and comprehensive monitoring is therefore crucial.
Based on the requirements of the background project, this thesis studies the key phases of a public sentiment mining system: web content extraction, web page preprocessing, sentiment page detection, new event detection, topic detection and tracking, and sentiment analysis. It addresses several key problems, namely the low accuracy of traditional web crawlers when extracting pages from the hidden web, the high false positive rate of the vector space model in new event detection, and the low precision of sentiment analysis, and it proposes new algorithms and strategies for them. A public sentiment mining system that fully complies with the requirements is implemented and tested.
This thesis proposes AjaxCrawler, which is based on dynamic script execution and reconstruction of the DOM context. Instead of building a navigation path of hyperlinks, AjaxCrawler builds a DOM state transition map whose states are linked by events on particular DOM nodes. DOM node distance is proposed as the heuristic score (h-score) function to speed up the search. Hidden content is then extracted by replaying the events on the shortest path. Case studies on extracting price tables and comments from B2C web sites show that AjaxCrawler achieves much higher precision and better performance than traditional crawlers.
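The search over the DOM state map can be illustrated with a minimal sketch. It is not the thesis's implementation: the state names, the event strings, the unit cost per event, and the heuristic estimates are all hypothetical, and the estimates only stand in for the DOM node distance, which the abstract does not define. The sketch assumes an A*-style search that returns the list of events to replay.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical state map: DOM state -> [(event fired on a DOM node, resulting state)].
Graph = Dict[str, List[Tuple[str, str]]]


def shortest_event_path(
    graph: Graph, start: str, goal: str, h: Callable[[str], float]
) -> Optional[List[str]]:
    """A*-style search; returns the events to replay to get from start to goal."""
    counter = 0  # unique tie-breaker so the heap never compares states or lists
    frontier = [(h(start), counter, 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, _, cost, state, events = heapq.heappop(frontier)
        if state == goal:
            return events
        if cost > best_cost.get(state, float("inf")):
            continue  # stale entry: a cheaper route to this state was found
        for event, nxt in graph.get(state, []):
            new_cost = cost + 1  # assumption: every event costs one step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                counter += 1
                heapq.heappush(
                    frontier,
                    (new_cost + h(nxt), counter, new_cost, nxt, events + [event]),
                )
    return None  # goal state is not reachable


# Illustrative product page whose comments are hidden behind two clicks.
graph: Graph = {
    "product_page": [
        ("click #tab-reviews", "reviews_tab"),
        ("click #tab-specs", "specs_tab"),
    ],
    "reviews_tab": [("click .load-more", "all_comments")],
    "specs_tab": [("click #tab-reviews", "reviews_tab")],
}
# Assumed heuristic estimates standing in for DOM node distance.
estimates = {"product_page": 2, "reviews_tab": 1, "specs_tab": 2, "all_comments": 0}

path = shortest_event_path(
    graph, "product_page", "all_comments", lambda s: estimates.get(s, 0)
)
print(path)
```

A crawler built along these lines would replay the returned events in order inside a script-enabled browser context and then extract the content from the resulting DOM.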
For new event detection, a new strategy based on pre-classification and named entity recognition is proposed. Web content is first pre-classified into ten classes, and only documents from the same class are fully compared. Weighted named entity similarity is proposed as the measure of document distance. Case studies show that this method improves both the precision and the recall of new event detection.
An improved method based on segmentation and weighted sentiment appraisal is applied to increase the precision of sentiment analysis.
The algorithms and systems implemented in this thesis have been fully functionally tested and put into trial use in the background project. Case studies show that the system achieves good precision and performance and is applicable in real-world practice.
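The new event detection strategy described above (pre-classification followed by a weighted named entity comparison within the same class) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the entity types, their weights, the use of Jaccard overlap, the threshold of 0.5, and the sample documents are all hypothetical, and the class label is assumed to come from a separate classifier.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Assumed weights per entity type; the abstract does not give the actual values.
ENTITY_WEIGHTS = {"person": 0.4, "location": 0.3, "organization": 0.3}


@dataclass
class Doc:
    category: str                  # label from a pre-classifier (assumed given)
    entities: Dict[str, Set[str]]  # entity type -> recognized entity names


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two entity sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def entity_similarity(d1: Doc, d2: Doc) -> float:
    """Weighted sum of per-type entity overlaps, in the range [0, 1]."""
    return sum(
        weight * jaccard(d1.entities.get(t, set()), d2.entities.get(t, set()))
        for t, weight in ENTITY_WEIGHTS.items()
    )


def is_new_event(doc: Doc, seen: List[Doc], threshold: float = 0.5) -> bool:
    """A document is a new event if no same-class document is similar enough."""
    same_class = [d for d in seen if d.category == doc.category]
    return all(entity_similarity(doc, d) < threshold for d in same_class)


seen = [
    Doc("finance", {"person": {"Li Wei"}, "organization": {"Acme Bank"},
                    "location": {"Shanghai"}}),
]
follow_up = Doc("finance", {"person": {"Li Wei"}, "organization": {"Acme Bank"},
                            "location": {"Shanghai", "Beijing"}})
unrelated = Doc("finance", {"person": {"Zhang Min"}, "organization": {"Acme Bank"},
                            "location": {"Shenzhen"}})

print(is_new_event(follow_up, seen))  # follow-up shares most entities with a seen story
print(is_new_event(unrelated, seen))  # only the organization overlaps
```

Restricting the comparison to documents of the same class reduces the number of pairwise comparisons, at the cost of depending on the pre-classifier: a document assigned to the wrong class is never compared with the stories it actually continues.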

Keywords/Search Tags:

Network Monitoring, Sentiment analysis, Information Retrieval

Related items

1	Sentiment Analysis On Entity Search Results
2	The Study On Financial Information Retrieval Oriented Genre Classification And Sentiment Analysis
3	Analysis And Application Of Sentiment For Network Users
4	Research On Complex Sentiment Analysis Method Of Network Discourse Based On Multimodal Sentiment Computing
5	Multimodal Sentiment Analysis Based On Deep Learning
6	Information Retrieval Oriented Analysis Of Text Content
7	Research On Hot Character And Event Analysis Techniques Oriented To Public Sentiment Monitoring
8	Sentiment Analysis Of Popular Events Based On Chinese Microblog Network
9	Sensitive Information Identification Based On Sentiment Analysis Of User Original Content
10	Research Of Online Financial Information Sentiment Analysis And Its Relationship With The Stock Market Volatility