
Detection Research Of Online Water Army Based On Clustering Algorithm

Posted on: 2019-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Chen
Full Text: PDF
GTID: 2428330548987821
Subject: Engineering
Abstract/Summary:
Micro-blogging is no longer an unfamiliar term in today's society. From the early Tencent micro-blog to the now widely used Sina micro-blog, it has brought great changes to many aspects of people's lives. In the information age, micro-blogs help us obtain information efficiently: without leaving home we can learn what is happening in the world, follow hot news and topics, and choose to follow only what interests us. Micro-blog has become the largest social networking platform by data traffic, and its user base keeps growing. At the same time, this huge volume of information carries potential negative elements. Users regularly encounter tedious, annoying accounts that crowd their view; people call these accounts the "water army". The network water army is defined as accounts that post similar assessments of particular content on the network, not to express personal opinions but to promote that content for a commission. Their most striking features are the similarity of their comment content, of the number of users they follow, and of the number of fans they have. Also referred to as the hired water army or the network water army, they are generally active on e-commerce sites, forums (BBS), and social networking platforms such as micro-blog. They pose as ordinary Internet users, publish false news, respond to a given point of view in similar patterns, and spread harmful posts, all of which degrades the online experience of ordinary users. As social networks become more and more developed, finding and controlling the water army has profound practical significance from the perspective of network security: it helps keep the Internet and social networking sites safe, restores a natural information environment, upholds fairness online, and puts an end to online violence.
The purpose of this thesis is to study how to detect such accounts. The basic idea is to find similar comments among hundreds of thousands of cluttered micro-blog comments, extract those comments together with statistical data on the users who posted them, and combine this with each user's location information, published post content, and ratio of followed accounts to fans, so that the water army can be identified through algorithmic analysis. The process is divided into four parts, covering data extraction and data analysis, and is presented in chapters 3 and 5 of this thesis. The first part deals with the selection of user relationship characteristics and the definition of each characteristic value. The second part covers the method of data extraction, which is implemented through micro-blog simulation. The third part covers the extraction of user characteristic values and the methods for training on and processing the data. The fourth part presents the clustering-based micro-blog water army identification algorithm; it mainly introduces the selection of values and the algorithm used, and judges feasibility through experimental analysis.
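The idea described in the abstract (score comment similarity, derive per-user characteristic values, then cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the sample users, the character-bigram Jaccard similarity, the bounded "following share" used in place of a raw following-to-fans ratio, and the tiny two-cluster k-means. The thesis may use different features and a different clustering algorithm.

```python
def shingles(text, n=2):
    """Return the set of character n-grams of a comment (assumed similarity basis)."""
    text = text.lower().strip()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity between the shingle sets of two comments."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def user_features(user, all_users):
    """Hypothetical feature vector: [highest similarity to another user's comment,
    share of following in (following + fans)]. Both values lie in [0, 1]."""
    sims = [jaccard(user["comment"], o["comment"]) for o in all_users if o is not user]
    share = user["following"] / max(user["following"] + user["fans"], 1)
    return [max(sims), share]


def dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans2(points, iterations=20):
    """Tiny 2-cluster k-means with deterministic initial centroids."""
    centroids = [min(points, key=sum), max(points, key=sum)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min((0, 1), key=lambda c: dist2(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def flag_suspects(users):
    """Return names in the cluster whose centroid has the higher feature sum."""
    points = [user_features(u, users) for u in users]
    labels, centroids = kmeans2(points)
    suspect = max((0, 1), key=lambda c: sum(centroids[c]))
    return [u["name"] for u, lab in zip(users, labels) if lab == suspect]


# Made-up sample data: three near-duplicate commenters and three ordinary users.
users = [
    {"name": "u1", "comment": "great product, highly recommend it", "following": 1800, "fans": 12},
    {"name": "u2", "comment": "great product, highly recommend it!", "following": 1950, "fans": 8},
    {"name": "u3", "comment": "great product, highly recommend", "following": 2000, "fans": 20},
    {"name": "u4", "comment": "the battery died after two weeks", "following": 150, "fans": 300},
    {"name": "u5", "comment": "shipping was slow but support helped", "following": 90, "fans": 120},
    {"name": "u6", "comment": "does anyone know if it works abroad?", "following": 200, "fans": 450},
]

print(flag_suspects(users))
```

Both features are kept in the range 0 to 1 so that neither dominates the distance computation; with raw following-to-fans ratios, some form of normalization would likely be needed before clustering.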
Keywords/Search Tags: Network water army, Sina micro-blog, User characteristic value, Clustering algorithm, SVM algorithm