Spam Filtering For Short Messages In Adversarial Environment

Posted on:2016-11-22

Degree:Master

Type:Thesis

Country:China

Candidate:C Yang

Full Text:PDF

GTID:2308330479993944

Subject:Computer application technology

Abstract/Summary:

As technology has developed and spread, mobile phones and online platforms (such as email, blogs, and forums) have become important means of daily communication. However, because sending messages is cheap, a growing number of criminals use these tools to spread spam, including advertising, pornography, fraud, and superstition, which seriously annoys users. Unlike an email, a short message contains only a few words, and its length usually has an upper limit; for example, a traditional SMS message is limited to 160 characters. As a result, short message text is rife with idioms and abbreviations, which can degrade the performance of traditional classifiers in short message spam filtering. Several studies in recent years have sought to improve the ability of classifiers to identify SMS spam. However, spam filtering for short messages in an adversarial environment, where an adversary manipulates samples to degrade the classifier's effectiveness, has not been investigated.
In an adversarial environment, an attacker can modify the feature values of a malicious sample so that it evades detection by the classifier. For example, the attacker can insert good words into a spam message to disguise it as a legitimate one and deceive the classifier. Because a short message has a length limit, short good words are preferred in such an attack. In this study, we investigate the good word attack and its countermeasure, feature reweighting, in short message spam filtering. Considering that the length of a short message has an upper limit, we propose a good word attack strategy based on the combination of feature length and weight, which selects the good words to insert according to both their weight values and their lengths. For defense, we also propose a feature reweighting method for short message filtering, with a new rescaling function based on the same combination of feature length and weight.
The proposed methods are evaluated and analyzed experimentally on two real datasets. The results show that the proposed attack and defense methods are more effective than traditional methods for short messages.
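
The abstract does not give the exact scoring function, so the following is only a minimal sketch of how a length-aware good word attack might work. It assumes a linear classifier in which each word has a weight (positive meaning legitimate-leaning), and it ranks candidate words by weight per character. That ratio, the function names `pick_good_words` and `attack`, and the example weights are all hypothetical illustrations, not the thesis's method.

```python
from typing import Dict, List

SMS_LIMIT = 160  # character limit of a traditional SMS, as stated in the abstract


def pick_good_words(message: str, weights: Dict[str, float],
                    limit: int = SMS_LIMIT) -> List[str]:
    """Greedily pick good words to append within the remaining character budget.

    Assumption: a positive weight means the word pushes a linear classifier
    toward the "legitimate" class. The weight-per-character score below is a
    hypothetical stand-in for the thesis's length/weight combination.
    """
    budget = limit - len(message)
    present = set(message.lower().split())
    candidates = [(word, w) for word, w in weights.items()
                  if w > 0 and word not in present]
    # Cost of a word is its length plus one separating space.
    candidates.sort(key=lambda ww: ww[1] / (len(ww[0]) + 1), reverse=True)

    chosen: List[str] = []
    for word, _ in candidates:
        cost = len(word) + 1
        if cost <= budget:
            chosen.append(word)
            budget -= cost
    return chosen


def attack(message: str, weights: Dict[str, float],
           limit: int = SMS_LIMIT) -> str:
    """Return the message with the selected good words appended."""
    return " ".join([message] + pick_good_words(message, weights, limit))


if __name__ == "__main__":
    # Made-up weights for illustration only.
    example_weights = {"meeting": 2.1, "ok": 1.2, "tomorrow": 2.4,
                       "hi": 0.9, "free": -3.0}
    print(attack("win free prize now", example_weights, limit=30))
```

With a tight budget, short words such as "ok" and "hi" are chosen ahead of longer words with higher raw weights, which is the intuition the abstract describes for attacks on length-limited messages.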

Keywords/Search Tags:

Short Message, Good Words Attack, Feature Reweighting, Spam Filtering

Related items

1	Evasion Attack And Its Application In Chinese Spam Message Filtering
2	SPAM Short Messages Filtering System Design And Implementation
3	Based On Wince Terminals Spam Filtering System Design And Implementation
4	Research On Key Techniques Of Spam Short Message Filtering
5	Research On Shielding Mechanism Of Short Message Spam And Its Application
6	Design And Implementation Of Spam Short Message Recognizing System
7	Analysis And Application Of Spam Sms Filtering Experimental Platform
8	Design And Implementation Of Short Message Intelligent Classify Based On Contents
9	Research On LSTM-Based Social Network Spam Filtering
10	Research And Implementation Of Content-Based Spam Filter Technology