
To Achieve A Highly Effective Spam Filtering System

Posted on: 2011-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Shi
Full Text: PDF
GTID: 2208330332477204
Subject: Software engineering
Abstract/Summary:
The volume of spam messages received by mobile phones is currently growing at an escalating rate. Left unchecked, spam will flood the network, seriously harass mobile subscribers, and greatly inconvenience normal communication. Some illegal short messages have even done great harm to society.

This thesis presents a monitoring and filtering solution for spam messages sent over the Internet. The solution is based on three features (sending frequency, message content, and message length) and monitors and filters messages by combining fuzzy matching with frequency monitoring. The system proposes control methods that address content variation from the message author's side and keyword-weighted sending frequency, among others. Numerous experiments in a computer lab show that the system improves the two metrics of filtering accuracy and misjudgment rate by 22.1% and 30.3%, respectively.

For the SMS data, 5,000 messages were collected, including both normal and spam messages. A selection of these messages was analyzed in depth to summarize and extract the characteristics of normal and spam messages, laying the foundation for later research on the filter. The same messages also serve as research and test data for the experimental system.

In the system framework design, the thesis first considers the implementation approach and the filtering principle. It briefly describes the two algorithms most commonly used to handle spam SMS today: filtering by text content, and filtering messages whose sender number appears on a blacklist (blacklist filtering). After introducing these two basic algorithms, it points out their shortcomings.
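The two baseline algorithms named above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the keyword set, the blacklisted number, and the function names are all hypothetical, and plain substring matching is assumed for the content filter.

```python
# Hypothetical data for illustration only; none of it comes from the thesis.
SPAM_KEYWORDS = {"lottery", "prize", "loan"}
BLACKLISTED_SENDERS = {"+8613800000000"}


def content_filter(text: str) -> bool:
    """Return True if the text contains any spam keyword (plain substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in SPAM_KEYWORDS)


def blacklist_filter(sender: str) -> bool:
    """Return True if the sender number is on the blacklist."""
    return sender in BLACKLISTED_SENDERS
```

Under these assumptions, `content_filter("You won a l.o.t.t.e.r.y")` returns `False` because the inserted dots break the substring match, and `blacklist_filter` returns `False` for any number that has not yet been listed. Both are typical ways such simple baselines can be evaded.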
Based on mobile phone users' requirements for spam SMS filtering, the thesis adopts the filtering principle of "better to let a spam message through than to wrongly block a legitimate one." To a certain extent this principle makes spam harder to identify, and it can lead to both misjudged messages and missed spam messages.

For the core filtering algorithm, the thesis draws on mature SMS spam filtering methods. Considering the current state of spam SMS filtering, its new trends, and the shortcomings of existing algorithms, it proposes this highly efficient spam filtering system and selects three filtering methods: a message content preprocessing algorithm, a keyword-weighted sending-frequency control algorithm, and an algorithm that corrects the sending frequency according to message content length. The content preprocessing algorithm effectively addresses a weakness of general keyword matching, which is easily defeated by substituting characters in the content. To avoid the misjudgments that come from relying on keyword filtering alone, the system uses the keyword-weighted sending-frequency control algorithm, which significantly reduces misjudgments. Finally, the sending frequency is adjusted according to the length characteristics of spam messages.
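One way the three methods could fit together is sketched below. This is an illustration under stated assumptions, not the thesis's algorithm: the keyword weights, substitution table, window size, thresholds, and all names are hypothetical. The sketch also assumes that longer messages get a stricter limit, which the abstract does not specify, and it uses simple character normalization in place of the thesis's fuzzy matching.

```python
import re
from collections import defaultdict, deque

# All constants are illustrative assumptions, not values from the thesis.
KEYWORD_WEIGHTS = {"lottery": 0.6, "prize": 0.4, "loan": 0.5}
SUBSTITUTIONS = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})
WINDOW_SECONDS = 60.0    # sliding window for counting a sender's messages
BASE_LIMIT = 10          # messages per window allowed for innocuous content
LONG_MESSAGE_CHARS = 60  # assumed length above which the limit is halved


def preprocess(text: str) -> str:
    """Undo simple obfuscation: map look-alike characters, strip symbols."""
    text = text.lower().translate(SUBSTITUTIONS)
    return re.sub(r"[^a-z\s]", "", text)


def keyword_score(clean_text: str) -> float:
    """Sum the weights of the keywords present, capped at 1.0."""
    score = sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in clean_text)
    return min(score, 1.0)


def allowed_rate(score: float, length: int) -> float:
    """Messages per window allowed: lower for suspicious or long content."""
    limit = BASE_LIMIT * (1.0 - score)
    if length > LONG_MESSAGE_CHARS:
        limit *= 0.5  # assumed direction of the length correction
    return max(limit, 1.0)  # a single message is never blocked on keywords alone


class FrequencyFilter:
    def __init__(self) -> None:
        # sender -> timestamps (seconds) of that sender's recent messages
        self._history = defaultdict(deque)

    def is_spam(self, sender: str, text: str, now: float) -> bool:
        limit = allowed_rate(keyword_score(preprocess(text)), len(text))
        recent = self._history[sender]
        while recent and now - recent[0] > WINDOW_SECONDS:
            recent.popleft()  # forget messages outside the window
        recent.append(now)
        return len(recent) > limit
```

In this sketch a keyword hit never blocks a message by itself; it only lowers the number of messages a sender may send within the window. That is one reading of the "better to let through than wrongly block" principle: a single message containing "lottery prize" passes, while the same sender repeating it within the window is flagged.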
Keywords/Search Tags: SMS spam filtering, fuzzy matching, frequency control