
A Collaborative Filtering Algorithm Based On The Bayes Classification Of Email Networks

Posted on: 2015-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: S Ou
Full Text: PDF
GTID: 2348330518470247
Subject: Computer application technology
Abstract/Summary:
Junk mail is abundant. It fills mail server storage, and users spend considerable time deleting it. Research on filtering Chinese spam is currently very active. With the development of text classification techniques, content-based filtering has become an effective approach to spam. Bayes classification is widely used in text classification and achieves good results. However, two major problems limit its accuracy. First, Bayes classifiers mostly consider each user individually and ignore the relationships between users. Second, a Bayes classifier performs well only after it has been fully trained, which requires long-term involvement and frequent feedback from users. To alleviate these problems and improve the accuracy of spam filtering, this thesis analyzes the propagation characteristics of spam from a holistic point of view. First, social network analysis of a student corpus shows that the real e-mail network is, to some degree, a small-world network. Second, the thesis constructs a student e-mail interaction graph, defines the interaction strength between user nodes, and collects these strengths in a matrix W. From W it derives a new measure of user interaction, called the node interaction probability, which accounts for the difference between sending and receiving mail. Finally, a collaborative spam filtering method based on user interaction is designed. By adjusting a weighting parameter, users can choose to filter spam based on their own judgment, on that of other users, or on a trade-off between the two.
Even without a user's personal mail collection and full training, the algorithm achieves good filtering performance. Experiments on real data sets show that, compared with a single-user filter, the collaborative filtering method is simple and improves all three evaluation metrics: recall (R), precision (P), and AUC.
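
The collaborative step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the formulas, so the sketch assumes that the node interaction probability is a row normalisation of W, that each user already has a spam probability for a message from their own Bayes filter, and that the weighting parameter (called `alpha` here, a hypothetical name) blends a user's own score with the interaction-weighted scores of other users. The function names are also hypothetical.

```python
def interaction_probability(W):
    """Row-normalise an interaction-strength matrix W (assumed definition).

    W[i][j] is taken to be the strength of interaction from user i's point
    of view with user j; W need not be symmetric, since sending and
    receiving mail may differ.
    """
    n = len(W)
    P = []
    for i in range(n):
        # Ignore self-interaction when normalising.
        row = [float(W[i][j]) if j != i else 0.0 for j in range(n)]
        total = sum(row)
        # Users with no recorded interaction get an all-zero row.
        P.append([w / total for w in row] if total > 0 else [0.0] * n)
    return P


def collaborative_scores(own_scores, P, alpha):
    """Blend each user's own spam score with the neighbours' weighted scores.

    alpha = 1 uses only the user's own filter, alpha = 0 only the other
    users' filters, and values in between trade off the two.
    """
    blended = []
    for i, own in enumerate(own_scores):
        if sum(P[i]) == 0:
            # No neighbours to consult: fall back to the user's own score.
            blended.append(own)
            continue
        neighbours = sum(p * s for p, s in zip(P[i], own_scores))
        blended.append(alpha * own + (1 - alpha) * neighbours)
    return blended


if __name__ == "__main__":
    # Toy data: three users, made-up interaction strengths and Bayes scores.
    W = [[0, 3, 1],
         [3, 0, 0],
         [0, 0, 0]]
    own = [0.5, 0.9, 0.1]  # each user's own spam probability for one message
    P = interaction_probability(W)
    print(collaborative_scores(own, P, alpha=0.5))
```

In this sketch a newly joined user with an untrained filter could set `alpha` close to 0 and rely mostly on the users they interact with, which mirrors the abstract's claim that good filtering is possible without full personal training.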
Keywords/Search Tags:Bayes classification, spam filtering, e-mail network, collaborative filtering