Research On Spam Review Filtering And Its Application In Review Management System Of Scratch Productions

Posted on: 2022-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: F X Zhang
Full Text: PDF
GTID: 2518306338986219
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, social platforms have facilitated communication and exchange between users, but they have also given rise to advertisements, profanity, and other harmful content. Traditional machine learning methods and neural networks are effective at detecting spam reviews that contain clearly sensitive vocabulary. However, owing to the complexity of Chinese characters and the freedom with which they can be written, users often replace sensitive words with irregular, seemingly innocuous variant words to evade spam review detection. Variant words are usually a kind of metaphor and no longer carry the surface meaning of their characters, which causes conventional detection methods to fail. Accurately identifying variant words and effectively detecting and filtering spam reviews is therefore especially important. In response to these problems and challenges, this thesis focuses on filtering short Chinese spam reviews and studies in depth Chinese text representation methods, spam review filtering methods, and text classification methods that incorporate contextual information. First, to address the variant words that are common in spam reviews, a spam review filtering model based on the joint embedding of multiple text features is proposed. The model captures the main features of sensitive words from the perspectives of pronunciation and glyph shape, so that it can recognize multiple kinds of variant words. It correctly recognizes spam reviews while reducing the misclassification of normal reviews. Next, to further improve filtering performance, a spam review filtering model that combines a user's historical reviews with multiple text features is proposed; it uses the review context as additional information, which enriches the diversity of the feature information. To verify the effectiveness of the two models, this thesis conducts multiple sets of comparative experiments on a dataset of Scratch production reviews and a dataset of goods reviews. The experimental results demonstrate the effectiveness of the proposed models. Finally, based on these models and methods, a review management system for Scratch productions is designed and implemented. Together with a review management mechanism, the system effectively curbs the emergence of spam reviews and provides users with a positive and healthy learning atmosphere.
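The idea of recognizing variant words through pronunciation and glyph shape can be illustrated with a small rule-based sketch. This is not the joint-embedding model of the thesis, which the abstract does not specify in detail; it is a simplified stand-in that compares characters through three views (identity, pronunciation, glyph components). The lookup tables `PINYIN` and `COMPONENTS`, the functions `char_similarity` and `contains_variant`, the threshold value, and the example phrase are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet

# Hypothetical toy pronunciation table (toneless pinyin). A real system
# would rely on a full pronunciation dictionary.
PINYIN: Dict[str, str] = {
    "加": "jia", "佳": "jia",
    "微": "wei", "薇": "wei", "威": "wei",
    "信": "xin", "芯": "xin",
    "好": "hao", "作": "zuo", "品": "pin",
}

# Hypothetical, simplified glyph decompositions used only for illustration.
COMPONENTS: Dict[str, FrozenSet[str]] = {
    "微": frozenset("彳山攵"),
    "薇": frozenset("艹彳山攵"),
    "信": frozenset("亻言"),
    "芯": frozenset("艹心"),
    "加": frozenset("力口"),
    "佳": frozenset("亻圭"),
}


def char_similarity(a: str, b: str) -> float:
    """Score two characters by identity, pronunciation, and glyph overlap."""
    if a == b:
        return 1.0
    pa, pb = PINYIN.get(a), PINYIN.get(b)
    # Pronunciation view: 1.0 when both characters share a known reading.
    phonetic = 1.0 if pa is not None and pa == pb else 0.0
    # Glyph view: Jaccard overlap of the component sets.
    ca = COMPONENTS.get(a, frozenset())
    cb = COMPONENTS.get(b, frozenset())
    union = ca | cb
    glyph = len(ca & cb) / len(union) if union else 0.0
    return max(phonetic, glyph)


def contains_variant(text: str, word: str, threshold: float = 0.8) -> bool:
    """Slide a window over text and flag spans that resemble word."""
    n = len(word)
    for i in range(len(text) - n + 1):
        window = text[i:i + n]
        score = sum(char_similarity(x, y) for x, y in zip(window, word)) / n
        if score >= threshold:
            return True
    return False


if __name__ == "__main__":
    sensitive = "加微信"  # hypothetical advertising phrase ("add me on WeChat")
    print(contains_variant("佳薇芯", sensitive))  # homophone variant
    print(contains_variant("好作品", sensitive))  # ordinary review text
```

In this sketch a plain keyword match would miss the homophone variant, while the pronunciation view still links it to the sensitive phrase. A learned model would replace the hand-written tables and the fixed threshold with embeddings trained on labeled reviews.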
Keywords/Search Tags:spam review, text classification, contextual information, variant words