Research On Spam Filtering In Adversarial Environments

Posted on:2016-06-20

Degree:Master

Type:Thesis

Country:China

Candidate:G X Wan

Full Text:PDF

GTID:2308330479493944

Subject:Computer application technology

Abstract/Summary:

With the rapid development and expansion of the Internet, email has become one of the most commonly used means of communication in everyday life, making both life and work more convenient. However, the growing volume of spam disrupts normal communication and causes substantial economic losses, so preventing its spread effectively has become increasingly important. Thanks to advances in artificial intelligence, a variety of machine learning approaches have been applied to spam filtering with good results.

In adversarial environments, however, spammers analyze the weaknesses of classifiers and disguise their spam with a variety of strategies to reduce the accuracy of spam filters. Research on classification in such environments is called adversarial learning. The evasion attack, one of the most popular strategies used by spammers, hides the sensitive features of spam through good word insertion and bad word deletion while keeping the message readable, thereby deceiving the spam filter and degrading its classification performance.

This thesis systematically reviews the history and development of spam and summarizes related research on spam filtering under adversarial attacks. The traditional TFIDF method uses term frequency to represent feature weights, but under good word attacks the weights of bad words drop sharply, which reduces classifier accuracy. An improved feature representation method, SRTFIDF, is therefore proposed to reduce the impact of such attacks on feature weights. Experimental results indicate that the improved feature representation is more robust than traditional TFIDF under good word attacks.

Compared with a single classifier system, multiple classifier systems improve the classification accuracy and robustness of spam filters. However, some studies show that traditional multiple classifier systems perform poorly in adversarial environments. This thesis therefore proposes a partitioned multiple classifier system based on multiple instance learning. The whole feature space is split equally into two instances, and several base classifiers are trained for each instance to improve the robustness of the spam filter. The proposed method is evaluated and analyzed experimentally on the CEAS 2008 spam dataset. The experiments show that the partitioned multiple classifier system outperforms traditional multiple classifier systems in classification accuracy and robustness under good word attacks and evasion attacks.
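
The drop in bad-word weights under a good word attack can be illustrated with a minimal sketch. It assumes a length-normalized term frequency (count divided by message length), which is one common TF variant and not necessarily the one used in the thesis. The token lists and the helper names `normalized_tf` and `good_word_attack` are hypothetical, and SRTFIDF itself is not reproduced here because the abstract does not define it.

```python
from collections import Counter


def normalized_tf(tokens):
    """Return the length-normalized term frequency of each token."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def good_word_attack(tokens, good_words):
    """Simulate good word insertion: append benign words, remove nothing."""
    return list(tokens) + list(good_words)


# Hypothetical spam message and benign words chosen for illustration only.
spam = ["cheap", "pills", "buy", "now"]
good_words = ["meeting", "agenda", "project", "report"]

attacked = good_word_attack(spam, good_words)

# Weight of the bad word "pills" before and after the attack.
before = normalized_tf(spam)["pills"]
after = normalized_tf(attacked)["pills"]
print(before, after)  # 0.25 0.125
```

In this toy case, doubling the message length with benign words halves the weight of every bad word even though no spam content was removed, which is the kind of sensitivity a more robust feature representation would need to reduce.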

Keywords/Search Tags:

Spam Filtering, Adversarial Environment, Feature Representation, Multiple Classifier System, Robustness

Related items

1	Researches On Defense Strategy Against Evasion Attacks
2	Spam Filtering For Short Messages In Adversarial Environment
3	Research On Image Spam Filtering System
4	Study On The Application Of Causative Attacks In Spam Filtering
5	Adversarial Classification For Email Spam Filtering
6	Research On Spam Filtering Technology Based On Multi-Modal Features
7	Design And Implementation Of Content Based Spam Filtering System
8	Design And Implementation Of The Spam Filtering System Based On VSTO
9	Research And Implementation Of Spam Filtering System Based On Improved Bayesian Algorithm
10	SVM-Based Spam Filtering