Spam Short Message Filtering System Design and Implementation

Posted on: 2013-01-23; Degree: Master; Type: Thesis
Country: China; Candidate: S J Wu; Full Text: PDF
GTID: 2248330371467164; Subject: Cryptography
Abstract/Summary:
With the development of communication services, mobile communication has grown rapidly, and both the number of phone users and mobile phone penetration are rising quickly. In China, by the end of September 2011 the number of mobile subscribers had reached 952 million and telephone penetration had reached 71.1 per hundred people. At the same time, SMS has become a main means of information transfer and communication because it is short, quick, simple, and inexpensive. While people enjoy the convenience of text messaging, they are also affected by spam short messages. Spam short messages run rampant and carry pornography, fraud, intimidation, harassment, advertising, and other illegal content. They seriously interfere with the daily life of mobile phone users, waste network resources, and pose potential harm to society. Short message filtering has therefore become a hot topic in academic research and an urgent need of mobile phone users.

This article first introduces the dangers, definitions, classifications, characteristics, and current management of spam short messages, as well as the basic approaches to spam short message filtering. Black/white list filtering and text classification filtering are described in detail. Secondly, the key technologies of text classification filtering are introduced, including text preprocessing, word segmentation, feature extraction, and text classification algorithms. The TF-IDF, MI, IG, and CHI feature extraction methods and the KNN and Bayes classifiers are studied and implemented, and, based on evaluation and analysis, a text classification algorithm is chosen for short message filtering. Finally, black/white list filtering and text classification filtering are combined to design a short message filtering system that runs on the mobile phone and is supplemented by a server.
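As an illustration of the CHI (chi-square) feature extraction named above, the following is a minimal Python sketch and not the thesis implementation. It assumes messages are already segmented into tokens and labelled "spam" or "ham"; the function name `chi_square_scores` and the toy corpus are hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term with the chi-square statistic for a two-class corpus.

    docs: list of token lists (already segmented).
    labels: list of "spam" / "ham" strings, same length as docs.
    """
    n = len(docs)
    n_spam = sum(1 for y in labels if y == "spam")
    n_ham = n - n_spam
    spam_df, ham_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = spam_df if y == "spam" else ham_df
        target.update(set(tokens))  # document frequency, not raw counts
    scores = {}
    for term in set(spam_df) | set(ham_df):
        a = spam_df[term]   # spam messages containing the term
        b = ham_df[term]    # normal messages containing the term
        c = n_spam - a      # spam messages without the term
        d = n_ham - b       # normal messages without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


# Toy corpus (made up for illustration only).
docs = [["free", "prize"], ["free", "win"], ["lunch", "today"], ["meeting", "today"]]
labels = ["spam", "spam", "ham", "ham"]
scores = chi_square_scores(docs, labels)
top_terms = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_terms)
```

Terms with the highest scores are the ones most strongly associated with one class, so a filter could keep only the top-scoring terms as its feature set.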
The system is implemented on the Windows Mobile platform. The major work of this article includes the following.

Advertising, fraud, objectionable, illegal, and other types of spam short messages, together with normal short messages, were collected to create a short message sample library. The sample library contains 600 spam short messages and 600 normal short messages.

Based on TF-IDF, MI, IG, and CHI feature extraction, KNN and Bayes filtering were implemented. The F-measure of KNN is 97.7%, and the F-measure of Bayes is 96.1%. Because Bayes does not need to store short message samples on the phone, Bayes classification filtering was finally chosen in order to save phone resources.

The system separates sample training from filtering. The server provides the training results for the short message samples to the mobile phone, which reduces the amount of computation on the phone and saves a lot of storage space.

The system is divided into a server and a mobile client and supports feedback learning from samples. The mobile client can obtain the latest feature thesaurus and can report wrongly classified short messages back to the server in order to achieve information sharing.

The designed filtering system was implemented on the Windows Mobile phone operating system. When short messages are sent with Cellular Emulator, the system can intercept and filter them accurately.
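The combination of black/white list filtering with Bayes classification, and the F-measure used for evaluation, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the system built in the thesis: the class `NaiveBayesFilter`, the functions `classify` and `f_measure`, the sender names, and the toy data are all hypothetical, and a multinomial naive Bayes model with Laplace smoothing is assumed because the abstract does not specify the variant.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Multinomial naive Bayes with Laplace smoothing (assumed variant)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, y in zip(docs, labels):
            self.word_counts[y].update(tokens)
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for t in tokens:
                if t in self.vocab:  # skip words never seen in training
                    score += math.log(
                        (self.word_counts[c][t] + 1) / (total_words + len(self.vocab))
                    )
            if score > best_score:
                best, best_score = c, score
        return best


def classify(sender, tokens, model, blacklist, whitelist):
    """Check the lists first; fall back to the text classifier."""
    if sender in whitelist:
        return "ham"
    if sender in blacklist:
        return "spam"
    return model.predict(tokens)


def f_measure(predicted, actual, positive="spam"):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy training data (made up for illustration only).
docs = [["free", "prize"], ["free", "win"], ["lunch", "today"], ["meeting", "today"]]
labels = ["spam", "spam", "ham", "ham"]
model = NaiveBayesFilter().fit(docs, labels)

blacklist, whitelist = {"blocked-sender"}, {"trusted-sender"}
print(classify("unknown-sender", ["free", "prize"], model, blacklist, whitelist))
```

Only the per-class counts held in the fitted model are needed at prediction time, which matches the reason given above for preferring Bayes over KNN: the training messages themselves do not have to be stored on the phone.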
Keywords/Search Tags:Spam Short Message, Text Categorization, Bayes, Feature Extraction, Short Message Filtering