Filter Spam Messages Based On Text Classification Algorithm

Posted on: 2009-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: N Guan
Full Text: PDF
GTID: 2208360245961101
Subject: Computer software and theory
Abstract/Summary:
Short messages have gradually become a very common communication tool thanks to their mobility, low price, entertainment value, and convenience. At the same time, the junk short message problem has grown more severe. The flood of junk short messages greatly disturbs people's lives and causes great losses for mobile operators. Therefore, research on intelligent monitoring technologies is of great significance.

With existing filtering methods, the short message center must analyze the messages passing through it one by one to determine whether each is spam. This is accurate but inefficient. Moreover, filtering methods based on keywords or on message content require a large amount of computation, which can congest the short message service center's network. As a result, the center either gives up filtering some junk short messages or delays message delivery, which reduces the accuracy and efficiency of spam filtering.

To overcome the shortcomings of existing filtering technologies, a multi-layered filtering algorithm for junk SMS is proposed. The concept of user confidence is introduced, and the SMS center filters messages according to each user's confidence. Three kinds of filtering technology (black/white list based, keyword based, and content based) are combined in the junk short message filtering method, which significantly increases efficiency. Accuracy is also greatly improved compared with a single filtering method. The main work includes:

1. A discrimination technique based on the behavior characteristics of short messages is proposed. Because different users exhibit different sending behavior, monitoring users' sending behavior not only provides real-time monitoring of junk short messages but also helps forecast junk short messages sent in the future.
2. Because the majority of cell phone users do not produce spam messages, the concept of user confidence is introduced. Based on each user's sending behavior, different confidence levels are defined for users.
3. A random testing method based on user confidence is proposed. This method increases the accuracy of junk short message filtering and improves the filtering efficiency of the SMS center.
4. The multi-layered system is implemented on the basis of the sampling monitoring method and the available junk short message filtering technologies. Experiments show that the proposed method is effective in filtering junk short messages.
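
The abstract does not give implementation details, so the following is only a minimal sketch of how confidence-based random sampling could sit in front of the three filtering layers. The class name `LayeredFilter`, the three confidence levels, the sampling rates, the 0.5 score threshold, and the pluggable `classifier` callable are all assumptions made for illustration and are not taken from the thesis.

```python
import random
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sampling rates: the more trusted a sender is,
# the smaller the fraction of their messages that gets inspected.
SAMPLING_RATE = {"high": 0.05, "medium": 0.5, "low": 1.0}


def no_classifier(text: str) -> float:
    """Stand-in for a trained text classifier; always returns a spam score of 0."""
    return 0.0


@dataclass
class LayeredFilter:
    blacklist: set = field(default_factory=set)
    whitelist: set = field(default_factory=set)
    keywords: set = field(default_factory=set)
    confidence: dict = field(default_factory=dict)  # sender -> confidence level
    classifier: Callable[[str], float] = no_classifier
    rng: random.Random = field(default_factory=random.Random)

    def is_spam(self, sender: str, text: str) -> bool:
        # Layer 1: black/white lists decide immediately.
        if sender in self.whitelist:
            return False
        if sender in self.blacklist:
            return True

        # Random sampling by confidence: unknown senders are treated as "low"
        # and always inspected; trusted senders are inspected only sometimes.
        level = self.confidence.get(sender, "low")
        if self.rng.random() >= SAMPLING_RATE[level]:
            return False  # not sampled: delivered without further inspection

        # Layer 2: cheap keyword match (case-insensitive substring test).
        lowered = text.lower()
        if any(keyword in lowered for keyword in self.keywords):
            return True

        # Layer 3: content-based classification (assumed score in [0, 1]).
        return self.classifier(text) >= 0.5
```

In this sketch the expensive keyword and content checks run only on the sampled messages, which is one way the load on the SMS center could be reduced. A fuller version would presumably also adjust a sender's confidence level from observed sending behavior, which is left out here.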
Keywords/Search Tags: junk messages, text classification, feature extraction, short messages filtering