Research On Application Of Bayesian In The Spam Filter Of Campus Message Board

Posted on: 2017-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhang
GTID: 2348330485483832
Subject: Engineering
Abstract/Summary:
The rapid development of Internet technology strongly supports interactive social networking platforms, and secondary vocational schools that are building digital campus networks likewise attach great importance to establishing such platforms. The message board attracts a high degree of attention: it is an important channel through which parents and students who care about the school's development report difficult problems, and an important auxiliary tool for school management, so its information security management cannot be ignored. The campus network message board is also the platform on which students frequently post information, so keeping its environment clean and reviewing spam messages becomes the primary task of the campus network administrator.

Based on the actual situation of secondary vocational schools, the main concern of this thesis is to let the network administrator detect and filter out spam simply and with the least effort. Taking the campus network of a secondary vocational school as an example, the thesis studies spam filtering for its message board. Messages displayed on the school's web page and messages blocked in the management back end are extracted as samples: messages displayed on the page are labeled legitimate, and messages blocked in the back end are labeled spam.

The study focuses on the following:

(1) Using the naive Bayesian classification algorithm, text classification technology is applied to spam filtering on the campus network message board. Messages are divided into two categories, legitimate and spam. The probability that a message belongs to each category is calculated, and the message is assigned to the category with the larger probability, which completes the classification and achieves spam filtering. Because the data sample is small, 3-fold cross-validation is used in the naive Bayesian filtering experiments to better verify the effectiveness of the algorithm.

(2) A filtering model that combines rules with naive Bayes is proposed. Naive Bayesian classification misses some spam, judging it to be legitimate. To improve the recall of the spam filter, a second pass over the Bayesian classification results is proposed: a spam dictionary is built manually, and the messages that the Bayesian classifier judged legitimate are string-matched against this dictionary, so that missed spam messages are corrected.
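
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the whitespace tokenizer, the Laplace smoothing constant `alpha`, the smoothed class prior, the toy English messages, and the `SPAM_TERMS` dictionary are all hypothetical, and real campus messages (for example Chinese text) would need a proper word segmenter.

```python
import math
from collections import Counter

LABELS = ("legit", "spam")

def tokenize(text):
    # Assumption: whitespace tokens; Chinese messages would need a word segmenter.
    return text.lower().split()

def train(messages, labels, alpha=1.0):
    """Count words per class; alpha is a Laplace smoothing constant (assumed)."""
    word_counts = {c: Counter() for c in LABELS}
    doc_counts = Counter()
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(word_counts["legit"]) | set(word_counts["spam"])
    return {"words": word_counts, "docs": doc_counts, "vocab": vocab, "alpha": alpha}

def log_posterior(model, text, label):
    """Unnormalized log P(label | text) under the naive independence assumption."""
    alpha = model["alpha"]
    n_docs = sum(model["docs"].values())
    # Smoothed prior, so a class missing from a small training fold never gives log(0).
    score = math.log((model["docs"][label] + alpha) / (n_docs + alpha * len(LABELS)))
    denom = sum(model["words"][label].values()) + alpha * len(model["vocab"])
    for tok in tokenize(text):
        if tok in model["vocab"]:  # words never seen in training are ignored
            score += math.log((model["words"][label][tok] + alpha) / denom)
    return score

def classify(model, text):
    """Pick the category with the larger posterior probability."""
    return max(LABELS, key=lambda c: log_posterior(model, text, c))

def second_pass(label, text, spam_terms):
    """Rule step: re-check messages judged legitimate against the spam dictionary."""
    if label == "legit" and any(term in text.lower() for term in spam_terms):
        return "spam"
    return label

def three_fold_accuracy(messages, labels, spam_terms=()):
    """3-fold cross-validation: each fold is held out once for testing."""
    correct = 0
    for k in range(3):
        train_idx = [i for i in range(len(messages)) if i % 3 != k]
        test_idx = [i for i in range(len(messages)) if i % 3 == k]
        model = train([messages[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in test_idx:
            pred = second_pass(classify(model, messages[i]), messages[i], spam_terms)
            correct += pred == labels[i]
    return correct / len(messages)

# Toy data (hypothetical): displayed messages are "legit", blocked ones are "spam".
messages = [
    "when does the new semester start",
    "cheap loans click this link now",
    "the canteen food has improved a lot",
    "win a free phone click now",
    "please repair the dormitory lights",
    "free loans no deposit click here",
]
labels = ["legit", "spam", "legit", "spam", "legit", "spam"]
SPAM_TERMS = ["casino", "lottery"]  # hypothetical manually built dictionary

model = train(messages, labels)
print(classify(model, "click now for free loans"))           # spam
print(classify(model, "is the library open this semester"))  # legit

missed = "visit the casino tonight"
first = classify(model, missed)
print(first, "->", second_pass(first, missed, SPAM_TERMS))   # legit -> spam
print(three_fold_accuracy(messages, labels, SPAM_TERMS))
```

In this sketch the dictionary pass can only turn a "legit" verdict into "spam", never the reverse, which matches the stated aim of raising spam recall. The cost of that choice is that a legitimate message containing a dictionary term would be blocked, so the dictionary has to be curated with care.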
Keywords/Search Tags: Message board, Text Filter, Naive Bayesian, Garbage dictionary, Regulation