
Design And Implementation Of Hot Word Discovery And Analysis System For Weibo

Posted on: 2020-09-27  Degree: Master  Type: Thesis
Country: China  Candidate: H G Tang  Full Text: PDF
GTID: 2428330590484263  Subject: Computer technology
Abstract/Summary:
Weibo has become an important platform for Internet users to share information and expand their social networks, as well as a powerful tool for the media to broaden their news sources and influence. It has also become an effective channel for the relevant departments to follow hot issues and public opinion. Weibo data is varied in type, large in volume, and constantly updated, so collecting and analyzing it manually is nearly impossible. This thesis therefore develops a system that provides hot word discovery and sentiment classification for Weibo data.

The thesis first analyzes user requirements in detail and, on that basis, designs a hot word discovery and analysis system with four main parts: Data Import, Data Processing, Data Analysis, and Visual Display. Data Import provides one-click data import and template download. Data Processing preprocesses the imported data, including de-duplication, Chinese word segmentation, and stop-word filtering. To better filter Internet slang, symbols, and emojis, several popular online stop-word lists are merged into one. Data Analysis provides word cloud drawing, word frequency statistics, and sentiment classification. Because the default SnowNLP sentiment corpus is trained on product review data, it performs poorly on Weibo content; this thesis therefore collects a corpus of Weibo comments and retrains the model on it, which improves the accuracy of sentiment analysis. Visual Display presents the word cloud, the word frequencies, and the sentiment classification results as a word cloud, a table, and a pie chart, respectively. These visual forms automatically filter out much low-frequency, low-quality text so that the user can grasp the gist of the content at a glance.

In terms of architecture, the system adopts a three-tier design that divides the application into a presentation layer, a business logic layer, and a data access layer, with the goal of "high cohesion, low coupling". This gives the system a clear structure, loose coupling, high maintainability, and high scalability, and makes it easy to adapt to changing requirements.
Keywords/Search Tags:WeiBo, Crawler, HotWord, Visualization
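
The Data Processing and word frequency steps described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names (`merge_stop_words`, `deduplicate`, `hot_words`) and the sample data are hypothetical, and the default tokenizer simply splits on whitespace so that the example runs with the standard library alone. For Chinese text, a real segmenter would be passed in instead.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set, Tuple


def merge_stop_words(*lists: Iterable[str]) -> Set[str]:
    """Merge several stop-word lists into one set, dropping blank entries."""
    merged: Set[str] = set()
    for words in lists:
        merged.update(w.strip() for w in words if w.strip())
    return merged


def deduplicate(posts: Iterable[str]) -> List[str]:
    """Remove empty and exact-duplicate posts, keeping first-seen order."""
    seen: Set[str] = set()
    unique: List[str] = []
    for post in posts:
        key = post.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def hot_words(
    posts: Iterable[str],
    stop_words: Set[str],
    tokenize: Callable[[str], List[str]] = str.split,  # assumption: whitespace split
    top_n: int = 10,
) -> List[Tuple[str, int]]:
    """Count non-stop-word tokens over de-duplicated posts; return the top_n."""
    counts: Counter = Counter()
    for post in deduplicate(posts):
        for token in tokenize(post):
            if token not in stop_words:
                counts[token] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample_posts = [
        "new phone launch today",
        "new phone launch today",  # exact duplicate, removed before counting
        "phone price too high",
    ]
    stop = merge_stop_words(["today", "too"], ["too"])
    print(hot_words(sample_posts, stop, top_n=3))
```

For Weibo text, a Chinese word segmenter can be supplied through the `tokenize` parameter, for example `tokenize=jieba.lcut` if the third-party jieba package is installed. The abstract does not name the segmenter used in the thesis, so that choice is an assumption.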