Study On Email Classifying Technique Based On Data Mining

Posted on:2005-03-11

Degree:Master

Type:Thesis

Country:China

Candidate:Y Li

Full Text:PDF

GTID:2168360125463895

Subject:Computer system architecture

Abstract/Summary:

With the widespread use of the Internet, email has become one of the main means of communication. The spam that comes with it, however, has become a serious problem. According to statistics, the USA loses roughly one billion every year to spam. The statistical report on the state of Chinese Internet development, issued by the Chinese Internet Center in July 2003, shows that 9 of every 16 emails received by Chinese Internet users are spam, which exceeds the number of normal emails. In China, large volumes of spam consume network bandwidth and bring email servers to a halt. Spam seriously disrupts normal email use because it is forced on recipients, deceptive, unhealthy, and repetitive. It wastes people's time, money, and energy, transmits pornographic content, and spreads falsehoods to deceive people, all of which causes serious trouble for society.

However, while spam is growing rapidly, anti-spam technology has stopped advancing. Because current anti-spam technology lacks intelligence and self-learning ability, it cannot identify new spam by learning from earlier spam instances. Although some anti-spam technologies, such as Bayesian filtering, can learn automatically, they work only on the content of an email and ignore its header fields, which is their greatest shortcoming.

In this thesis, inspired by Bayesian filtering, I use data mining techniques to study a self-learning anti-spam technology. Data mining has become the core technology of business intelligence. It has been widely used in many areas and has drawn the attention of the whole academic community. Some algorithms and techniques from artificial intelligence, including decision trees and neural networks, have been applied in data mining for prediction, pattern recognition, classification, and clustering.

After analyzing and studying email, this thesis discretizes emails, extracts their features, and represents each email as a vector. It then proposes a decision tree classification model based on information entropy. Finally, a series of experiments and tests is carried out. The results show that the model can learn to identify new spam from the header-field, network, structure, and content information of emails, which indicates that the model and method work well.
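The abstract does not say which tree-building algorithm the thesis uses, so the following is only a minimal sketch of the general idea, assuming an ID3-style criterion: each email is a vector of discrete features, and the feature with the highest information gain (the largest reduction in entropy) is chosen for the split. The feature names, the toy data, and the helper functions are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the data on one discrete feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """Feature an ID3-style tree would split on first (highest gain)."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy data: each email is a vector of discrete features
# drawn from its header fields and structure (assumed, not from the thesis).
EMAILS = [
    {"many_recipients": True, "has_html": True},
    {"many_recipients": True, "has_html": False},
    {"many_recipients": False, "has_html": True},
    {"many_recipients": False, "has_html": False},
]
LABELS = ["spam", "spam", "normal", "normal"]
FEATURES = ["many_recipients", "has_html"]

if __name__ == "__main__":
    for name in FEATURES:
        print(name, information_gain(EMAILS, LABELS, name))
    print("split on:", best_feature(EMAILS, LABELS, FEATURES))
```

In this toy set, `many_recipients` separates the two classes perfectly while `has_html` carries no information, so the former would be chosen as the root split. A full tree would repeat the same selection recursively on each subset.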

Keywords/Search Tags:

Data mining, information entropy, decision tree, spam

Related items

1	Research On Filtering Technology Of Spam Communication Behavior Detection Based On Decision Tree Algorithm
2	Research And Application On Decision Tree In Data Mining
3	Improvement And Application Of Decision Tree With Covariance&Information Entropy
4	Research On Attribute Reduction Algorithm Based On Decision Tree And Information Entropy
5	Analysis And Design Of Insurance Industry Information System Based On Data Mining
6	Research On Application Of Data Mining Technology In College Teaching Study Based On Inductive Learning
7	The Research On The Algorithms Of Optimizing Decision Tree Classification
8	Research On Spam Behavior Patterns And Recognition Methods
9	Research On Technology Of Data Mining Algorithm Based On Decision Tree
10	Decision Tree Model Based On Generalized Information Entropy And Its Application In Performance Evaluation