A Research Of Micro-blog Comment Spam Recognition Method Integrated With Comment Network Graph

Posted on: 2018-12-09  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Pan
GTID: 2348330566951634  Subject: Computer Science and Technology
Abstract/Summary:
With the growing influence of the micro-blog platform, large volumes of meaningless comment spam, in the form of advertising or malicious attacks, are flooding the platform and threatening its stable and harmonious development. Building on existing studies, the key problem for the platform in comment spam recognition is therefore how to improve the overall recognition rate while reducing the misclassification rate for both normal and spam comments. On top of a comment network graph model, and considering the relevancy between a comment and the original blog post, a text relevancy model is adopted in place of the traditional text similarity model to reduce the misclassification rate of normal comments. Text relevancy in turn relies on word relevancy, which is obtained from a full-text search library of the micro-blog corpus built on the Lucene search engine. When the text features are not rich enough, the friendliness between users and the credibility of the commenter can be quantified through the users' common attributes, their interaction frequency, and a user mutual evaluation model. The higher the friendliness and credibility, the lower the probability that the users publish comment spam to each other, which improves the accuracy of the spam recognition algorithm. To enhance the algorithm's performance, a graph database is chosen to store and manage the comment network graph and its various connections. The recognition results of each test are also fed back incrementally into the comment network graph and the text classifier; this incremental learning mechanism can further improve the overall recognition rate. Test results show that the proposed recognition method significantly improves the overall comment spam recognition rate and reduces the normal/spam comment misclassification rate, and that the computation time with graph storage is much lower than with relational storage.
Keywords/Search Tags:micro-blog, comment spam recognition, comment network graph, comment metadata, incremental learning, graph storage
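
The abstract does not give the formulas behind friendliness, credibility, or the final decision, so the following is only a minimal sketch of how such signals might be combined. Everything in it is an assumption made for illustration: the names `CommentGraph`, `friendliness`, and `spam_score`, the Jaccard overlap used for common attributes, the cap on interaction counts, and the equal or fixed weights. It uses in-memory dictionaries rather than the graph database and Lucene index described in the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class CommentGraph:
    """Toy in-memory stand-in for the comment network graph (hypothetical)."""
    # user -> set of profile attributes, e.g. {"city:wuhan", "tag:music"}
    attributes: dict = field(default_factory=dict)
    # (commenter, author) -> number of past interactions
    interactions: dict = field(default_factory=dict)

    def friendliness(self, a: str, b: str, max_interactions: int = 10) -> float:
        """Score in [0, 1] from shared attributes and interaction frequency.

        Assumption: equal weighting of the two signals; the thesis may
        combine them differently.
        """
        attrs_a = self.attributes.get(a, set())
        attrs_b = self.attributes.get(b, set())
        union = attrs_a | attrs_b
        # Jaccard overlap of the two users' attribute sets.
        shared = len(attrs_a & attrs_b) / len(union) if union else 0.0
        # Interactions in both directions, capped so the score stays in [0, 1].
        count = self.interactions.get((a, b), 0) + self.interactions.get((b, a), 0)
        freq = min(count, max_interactions) / max_interactions
        return 0.5 * shared + 0.5 * freq


def spam_score(text_relevancy: float, friendliness: float, credibility: float,
               w_text: float = 0.5, w_friend: float = 0.25,
               w_cred: float = 0.25) -> float:
    """Higher relevancy, friendliness, and credibility give a lower spam score.

    All inputs are expected in [0, 1]; the weights are illustrative only.
    """
    trust = w_text * text_relevancy + w_friend * friendliness + w_cred * credibility
    return 1.0 - trust


graph = CommentGraph(
    attributes={
        "alice": {"city:wuhan", "tag:music"},
        "bob": {"city:wuhan", "tag:film"},
    },
    interactions={("alice", "bob"): 5},
)

friend_score = graph.friendliness("alice", "bob")
stranger_score = graph.friendliness("mallory", "bob")

# A relevant comment from a trusted friend versus an off-topic one from a stranger.
likely_normal = spam_score(0.8, friend_score, credibility=0.9)
likely_spam = spam_score(0.1, stranger_score, credibility=0.2)
```

In this sketch the text relevancy and credibility values are passed in directly. In the method described above they would come from the Lucene-based word relevancy and the user mutual evaluation model, and recognition results would be fed back into the graph over time.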