
Research On Tianya BBS Posts Interfered With Water Army Based On SOM K-means Clustering

Posted on: 2014-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Chen
GTID: 2297330452456112
Subject: Management Science and Engineering
Abstract/Summary:
The emergence and popularity of the Internet have brought great changes to people's lives. Various types of online BBS have expanded the channels for exchanging and sharing information. However, the involvement of the online water army has greatly reduced the authenticity and effectiveness of online information, interfered with genuine public opinion, and even triggered a crisis of trust online. Therefore, understanding the characteristics of the water army and how to identify the posts it controls is of great significance for solving the problem.

This paper collected attribute data from hot posts on Tianya BBS, which is regarded as one of the most active battlefields of the water army. The data were first preprocessed, and three effective variables were extracted for clustering through correlation analysis: the registration date, the number of logins, and the number of fans, all taken from users' home pages. A self-organizing map (SOM) neural network was then used as a preliminary step to find the most reasonable cluster number N and the cluster centers, which served as the cluster number and initial cluster centers for the subsequent K-means clustering. In this way, the two-stage SOM K-means clustering was implemented and the clustering accuracy was improved. From the clustering analysis, concentrated registration dates, few fans and posting records, obvious similarity in the variable data, and abnormal regularity in user ID names were identified as typical features of the water army. Accordingly, four criteria were put forward to identify posts interfered with by the water army: the concentration of registration dates, the similarity of the variable data, the concentration of registration dates among respondents on the first two pages, and the naming rules of user IDs. These criteria were shown to be effective in identifying the water army on two randomly selected posts. Finally, the deficiencies of this paper and directions for future research were pointed out.

In conclusion, this paper follows a logical process of data acquisition, processing, analysis, and application. From a clustering perspective, it explores the features of different BBS users, especially the water army group, and provides an effective method for identifying posts interfered with by the water army.
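
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how N is read off the SOM, so the sketch assumes that N is the number of SOM nodes that win at least one sample and that those nodes' weight vectors seed K-means. The grid size, learning-rate schedule, z-score standardization, function names, and the synthetic three-column data (standing in for registration date, login count, and fan count) are all hypothetical choices made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def standardize(data):
    """Z-score each column so the three variables share one scale (assumed preprocessing)."""
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]


def train_som(data, rows=2, cols=2, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM and return its node weight vectors."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(data)) for _ in grid]  # initialise from samples
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood
        for x in rng.sample(data, len(data)):
            bmu = nearest(x, weights)            # best-matching unit
            for j, pos in enumerate(grid):
                h = math.exp(-sq_dist(pos, grid[bmu]) / (2 * sigma ** 2))
                weights[j] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[j], x)]
    return weights


def kmeans(data, centers, max_iter=100):
    """Lloyd's K-means started from the given initial centers."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        labels = [nearest(x, centers) for x in data]
        new_centers = []
        for k, old in enumerate(centers):
            members = [x for x, lab in zip(data, labels) if lab == k]
            # An empty cluster keeps its previous center.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return [nearest(x, centers) for x in data], centers


def som_kmeans(data, rows=2, cols=2, seed=0):
    """Stage 1: SOM proposes N and initial centers. Stage 2: K-means refines them."""
    weights = train_som(data, rows, cols, seed=seed)
    occupied = sorted({nearest(x, weights) for x in data})  # assumption: N = occupied nodes
    init_centers = [weights[j] for j in occupied]
    labels, centers = kmeans(data, init_centers)
    return len(init_centers), labels, centers


# Synthetic, made-up rows: [registration day, login count, fan count].
rng = random.Random(1)
ordinary = [[rng.gauss(1500, 400), rng.gauss(800, 200), rng.gauss(300, 100)]
            for _ in range(30)]
suspicious = [[rng.gauss(100, 5), rng.gauss(20, 5), rng.gauss(3, 1)]
              for _ in range(30)]
data = standardize(ordinary + suspicious)
n_clusters, labels, centers = som_kmeans(data)
```

Seeding K-means from the SOM removes the need to guess the cluster number by hand and avoids random initial centers, which is the motivation the abstract gives for the two-stage design. With real data, the grid size would need to be chosen large enough that the number of occupied nodes is not capped artificially.
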
Keywords/Search Tags: Water Army, BBS Posts, SOM K-means, Clustering Analysis, Identify