Design And Implementation Of Micro-blog Data Mining Visualization System

Posted on: 2018-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Wang
Full Text: PDF
GTID: 2348330515978263
Subject: Engineering
Abstract/Summary:
With the continuous improvement of the mobile communication network environment and the further popularization of smartphones, China's Internet has entered the era of Web 2.0. As a typical representative of Web 2.0, micro-blog has a large number of active users, covers a wide range of content, and has huge social influence. Micro-blog has become an important channel for people to obtain information and share opinions, and its massive data holds great academic value. Therefore, this paper takes micro-blog as the research object, studies the collection, mining, sentiment analysis and visualization of micro-blog data, and designs and implements a data mining and visualization system based on micro-blog. The main work of this paper includes:

(1) In the aspect of data acquisition, a micro-blog crawler system is designed and implemented. The system uses simulated login to solve the authentication problem. Following the idea of breadth-first search, the hot micro-blog monitoring module is used to discover high-quality users automatically. The system combines web crawling, BeautifulSoup, regular expressions, multi-threaded concurrency and database technology to collect a variety of user information and micro-blog information. The crawler system solves the problems of incomplete information collection and of requesting the micro-blog server too frequently, and achieves comprehensive and efficient collection of micro-blog data.

(2) In the aspect of data mining, this paper designs and implements the functional modules of micro-blog data mining, including the user analysis module and the micro-blog analysis module, which provide the basic functions of micro-blog analysis. This paper also focuses on sentiment analysis of micro-blog text based on machine learning algorithms, and a classifier training experiment is designed and carried out. Three feature extraction models are used: "individual word", "double words", and "individual word combined with double words". Feature selection is performed with the chi-square method. Six classification algorithms, including Naive Bayes, logistic regression and support vector machine, are compared experimentally. The optimal classification model is obtained through repeated experiments, and it classifies both micro-blog text and short comment text well.

(3) In the aspect of data visualization, this paper uses histograms, line charts, maps, tag clouds, pie charts, dashboards and other visual charts to display the data analysis results, which are presented through the browser.

The system uses a browser/server (B/S) architecture. The front end displays the data analysis results in the browser. The back end combines the micro-blog crawler, a MySQL relational database and the data mining module to provide data collection, processing and analysis. Together, these realize data mining and visual analysis of micro-blog.

The main contributions and innovations of this paper are as follows:

(1) A micro-blog analysis system is designed and implemented, covering micro-blog data acquisition, data mining and data visualization. The system provides user analysis and micro-blog analysis functions, and offers a basic platform for further research.

(2) The system provides sentiment analysis of micro-blog text. In this paper, machine learning algorithms are used to train the sentiment analysis model. The model reaches an accuracy of 85% and an AUC of 0.94. The system can use the resulting classifier directly to perform sentiment analysis of micro-blog text.
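The feature extraction and chi-square feature selection steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that "individual word" and "double words" correspond to unigrams and bigrams over already segmented text, that labels are binary (1 for positive, 0 for negative), and that a feature is scored by its presence or absence in a document. The function names `extract_features`, `chi_square_scores` and `select_top_k` are hypothetical.

```python
from collections import Counter


def extract_features(tokens, mode="both"):
    """Return the set of features present in one tokenized text.

    mode is "uni" (single words), "bi" (adjacent word pairs) or "both".
    Features are tuples: (w,) for a unigram, (w1, w2) for a bigram.
    """
    feats = set()
    if mode in ("uni", "both"):
        feats.update((t,) for t in tokens)
    if mode in ("bi", "both"):
        feats.update(zip(tokens, tokens[1:]))
    return feats


def chi_square_scores(docs, labels, mode="both"):
    """Score every feature with the chi-square statistic of its
    2x2 table (feature present/absent vs. positive/negative label)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()  # document frequency per class
    for tokens, y in zip(docs, labels):
        (pos_df if y == 1 else neg_df).update(extract_features(tokens, mode))

    scores = {}
    for f in set(pos_df) | set(neg_df):
        a = pos_df[f]      # positive documents containing f
        b = neg_df[f]      # negative documents containing f
        c = n_pos - a      # positive documents without f
        d = n_neg - b      # negative documents without f
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[f] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring features (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f for f, _ in ranked[:k]]
```

The selected features would then form the vocabulary from which each text is turned into a feature vector for the classifiers being compared. How the thesis tokenizes text and how many features it keeps are not stated in the abstract, so those choices are left open here.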
Keywords/Search Tags: Micro-blog, Crawler, Data mining, Sentiment analysis, Machine learning