The User Comment Analysis System Based On Distributed Web Crawling

Posted on: 2019-11-01 | Degree: Master | Type: Thesis
Country: China | Candidate: X Luo | Full Text: PDF
GTID: 2428330590995930 | Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of the mobile Internet and the growth of network bandwidth, many people can now enjoy multimedia content that enriches their lives. As video consumption has shifted from the PC to mobile devices, the way people interact with videos has changed greatly. This work studies user reviews from two angles: acquisition and analysis. The number of bullet-screen comments appearing on video websites grows exponentially every day, yet current user review collection lacks a complete distributed crawler architecture, so this work builds a distributed crawling system. First, to cope with the massive data volume, a control spider and a log module are designed. The log module supports system debugging and data recovery, while the control spider schedules tasks and monitors the whole system. Then, according to the data collection requirements, a drama spider, an episode spider, and a review spider are designed to lay the foundation for user review classification. In addition, this work implements a user review analysis system for sentiment classification. The system classifies the sentiment of user reviews through data preprocessing, keyword extraction, and word2vec. In data preprocessing, user reviews are segmented with jieba and stop words are filtered out. The subject terms of each review are then extracted with TF-IDF to show its core topics. Classification is then performed by the word2vec model, which first maps each review into vectors through an embedding layer and then applies classification layers consisting of a flatten layer, a hidden layer, and an output layer. According to the experimental results, the method can effectively learn the feature information in user reviews and therefore performs well at sentiment classification. The system reaches an accuracy of 82.79%, an improvement of at least 7.11% over the traditional system.
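The preprocessing and keyword extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the thesis segments Chinese text with jieba, whereas this sketch assumes whitespace-separated English reviews so that it runs with the standard library alone. The stop-word list, the function names `preprocess` and `tfidf_keywords`, and the sample reviews are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the thesis does not state which list it used.
STOP_WORDS = {"the", "a", "is", "this", "and", "so"}


def preprocess(review):
    """Lowercase, split on whitespace, and drop stop words.

    Assumption: whitespace splitting stands in for jieba segmentation,
    which the thesis applies to Chinese reviews.
    """
    return [tok for tok in review.lower().split() if tok not in STOP_WORDS]


def tfidf_keywords(reviews, top_k=2):
    """Return the top_k highest-scoring TF-IDF terms for each review."""
    docs = [preprocess(r) for r in reviews]
    n_docs = len(docs)
    # Document frequency: number of reviews that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        # Score = relative term frequency * log(N / document frequency).
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by score descending, then alphabetically so ties are stable.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


if __name__ == "__main__":
    sample = [
        "the plot is great",
        "the acting is great",
        "this plot twist is boring",
    ]
    for review, keywords in zip(sample, tfidf_keywords(sample)):
        print(review, "->", keywords)
```

Terms that appear in many reviews (here "great" and "plot") receive a lower weight than terms unique to one review, which is why TF-IDF is suited to surfacing the core topic of each review.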
Keywords/Search Tags: neural network, user review, TensorFlow, TF-IDF, spider, natural language processing