A Study On The Method Of Detecting Fakers For Online Forums

Posted on:2018-03-11

Degree:Master

Type:Thesis

Country:China

Candidate:P W Yin

Full Text:PDF

GTID:2428330590492281

Subject:Computer technology

Abstract/Summary:

As the Internet reaches into every aspect of daily life, people can learn new things more easily than ever, and information from the Internet constantly shapes how they understand the world. This information is not always correct, however; a large amount of it is deliberately faked to serve particular interests. As the Internet has developed, posting fake information has grown into a black-market industry. Fakers, the users who post it, steadily degrade the health of the Internet, distort people's understanding of things and sometimes cause serious social harm. Detecting fake information and fakers is therefore critical to keeping the Internet, and even society, healthy. In recent years many researchers have studied how to detect spam reviews and fakers on microblogs, Twitter and Facebook, but there has been little work on online forums. This thesis therefore focuses on detecting fakers on online forums.

Detection proceeds in two steps. In step one, a machine learning classification model detects junior fakers. The model performs well: its accuracy is about 98.1%, its recall about 99.1% and its precision about 97.2%. In step two, starting from the detected junior fakers and the user network, algorithms such as PageRank compute three user ranks, which are used to detect more fakers, especially senior ones. In our experiment on more than 5,000 users, step two detected 78 additional fakers, 15 of them senior. Overall, the performance is good.

The method works as follows. First, a dictionary-based topic sentiment analysis produces a topic sentiment vector for every review. From this vector we derive sentiment features such as "biggest positive/negative topic sentiment". In step one, these sentiment features, together with basic user features and time-window features, form the input to the machine learning models. In step two, the topic sentiment vector is the input to a sentiment distance calculation. The sentiment distance between two reviews determines whether the two users who posted them stand in a support, contrary or neutral relation. From these relations we construct a user support network, a user contrary network and a faker support network, and compute a user rank on each of the three networks with algorithms such as PageRank. Finally, K-Means clustering over the three user ranks detects more fakers, especially senior ones.

The method draws on many web mining techniques, including word segmentation, sentiment classification, feature analysis, machine learning classification, PageRank and data clustering. With this method, faker detection performs well.
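
The step-two idea can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the distance metric, the thresholds or the edge direction, so the Euclidean distance, the `support_max` and `contrary_min` cut-offs, the symmetric edges and the sample data below are all assumptions made for illustration. Only the user support network is shown; the contrary network and the faker support network would be ranked the same way.

```python
import math

def sentiment_distance(u, v):
    """Euclidean distance between two topic sentiment vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def relation(u, v, support_max=0.5, contrary_min=1.5):
    """Classify a pair of reviews; both thresholds are illustrative assumptions."""
    d = sentiment_distance(u, v)
    if d <= support_max:
        return "support"
    if d >= contrary_min:
        return "contrary"
    return "neutral"

def pagerank(edges, nodes, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over directed (source, target) edges."""
    out = {n: [] for n in nodes}
    for s, t in edges:
        out[s].append(t)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new = {n: base for n in nodes}
        for s in nodes:
            for t in out[s]:
                new[t] += damping * rank[s] / len(out[s])
        rank = new
    return rank

# Hypothetical data: one topic sentiment vector per user (one review each).
reviews = {
    "alice": [0.9, 0.1],
    "bob": [0.8, 0.2],
    "carol": [-0.9, -0.2],
    "dave": [0.85, 0.15],
}
users = sorted(reviews)

# Assumed construction: a support relation yields an edge in both directions.
support_edges = [
    (a, b) for a in users for b in users
    if a != b and relation(reviews[a], reviews[b]) == "support"
]
support_rank = pagerank(support_edges, users)
```

In this toy data the three users with similar sentiment vectors support one another and receive equal, higher ranks, while the isolated user ranks lowest. In the thesis, the three resulting ranks per user are then clustered with K-Means to separate likely fakers; that step is not shown here.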

Keywords/Search Tags:

data mining, opinion mining, sentiment classification, machine learning, PageRank, data cluster

Related items

1	Research On Website Public Opinion Analysis Platform Based On Click Stream Data Mining
2	Research On Sentiment Classification And Opinion Mining Technique Of Online Reviews
3	Research On Aspect-Level Opinion Mining Technology With Product Reviews
4	Research And Implementation Of Sentiment Classification And Opinion Mining In Micro-Blog Text
5	Opinion Mining In Online Forums For Financial Q & A System
6	Sentiment Classification By Combining Lexicon-based And Machine Learning Methods
7	Mining And Analysis Of Stock Market Public Opinion Data
8	Domain Adaptation Of Sentiment Classification
9	Research On Cross-domain Chinese Explanatory Opinion Mining Method Based On Transfer Learning
10	Research And Improved Of PageRank Algorithm In Web Data Mining