Research And Implementation Of Internet Forum Water Army Detection Based On User Behaviors

Posted on: 2018-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Lv
Full Text: PDF
GTID: 2348330515968626
Subject: Software engineering
Abstract/Summary:
An Internet forum water army is an individual or a group of people who post and reply to hot topics on Internet forums far more frequently than ordinary users do, in order to influence public opinion. Studying the behavior of Internet water armies is necessary because their activities are often accompanied by rumor making and the covering up of facts, and can even provoke mass unrest around particular events. So far, techniques that detect and analyze Internet water armies based on web page content remain impractical. This thesis designs a detection system based on user behaviors and classification methods. First, using a focused crawler based on breadth-first search, we "browse" the relevant forum pages after a simulated login procedure, then fetch the forum data and store it in a database. Second, we extract 8 features from these data that distinguish water army accounts from normal ones. After dividing the samples into a training set and a test set, we label both with an auxiliary manual labeling method. A C4.5 decision tree classification model is then built on the labeled training set and used to make predictions on the test samples. Finally, we compare the predictions with the results of pure manual judgment and of the auxiliary manual labeling method, obtaining precision of 74.00% and 89.49%, respectively. These results indicate that the extracted user behavior features are highly discriminative and effective, and that the method proposed in this thesis can be used to detect Internet water armies in forums.

The first part of this thesis introduces the Internet water army and related studies on its detection. The second part briefly introduces several key techniques used in this study. The third part presents the methods of forum data collection and structured storage, including the implementation of simulated login and the design of the focused crawler and the database. The fourth part analyzes user feature extraction methods from related research and then focuses on extracting forum user behavior features for the forum under study. The fifth part designs an Internet water army detection system based on user behavior and classification methods, which can be used to detect and recognize Internet water armies in forums. Finally, a summary of the thesis and some future prospects for Internet water army detection are presented.
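
The classification and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it only shows the gain ratio criterion that C4.5 uses to choose a threshold split on a numeric feature, together with precision in its usual sense (true positives divided by all predicted positives). The feature `posts_per_day`, the label strings, and the toy data are assumptions made for illustration; the abstract does not list the 8 features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` (C4.5 criterion)."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    info_gain = entropy(labels) - (weights[0] * entropy(left)
                                   + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def precision(predicted, actual, positive="water_army"):
    """Fraction of accounts predicted as `positive` that really are positive."""
    flagged = [a for p, a in zip(predicted, actual) if p == positive]
    if not flagged:
        return 0.0
    return sum(a == positive for a in flagged) / len(flagged)


# Hypothetical toy data: one behavioral feature for six accounts.
posts_per_day = [1, 2, 3, 40, 55, 60]
labels = ["normal", "normal", "normal",
          "water_army", "water_army", "water_army"]

# Evaluate two candidate thresholds; C4.5 would keep the one with the
# higher gain ratio.
for t in (1, 3):
    print(t, round(gain_ratio(posts_per_day, labels, t), 4))
```

On this toy data the threshold that separates the two classes cleanly scores higher than the one that isolates a single account, which is the behavior the criterion is designed to reward. A full C4.5 tree would repeat this search over every feature and recurse on each branch.
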
Keywords/Search Tags:Internet Water Army, Web Crawler, Feature Extraction, Classification