
The Study Of Naive Bayes Text Classification System Based On Artificial Intelligence

Posted on: 2006-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Zhou
Full Text: PDF
GTID: 2168360152996626
Subject: Systems Engineering
Abstract/Summary:
With the arrival of the information age, and especially with the growing influence of the Internet on everyday life, enormous amounts of information now appear as text on the Internet, in digital libraries, and on company intranets. How to obtain the needed information quickly and accurately has become a research hotspot in the field of information processing. Text classification based on artificial intelligence (AI) is one approach to this problem. This thesis discusses text classification from the perspectives of classification theory, algorithm modification, and implementation.

First, the traditional solutions to several key technical problems in text categorization are studied, the core techniques and system architecture of typical text categorization systems are discussed, and the applications of text categorization are summarized.

From a statistical point of view, traditional statistical text classification methods are powerful, but they often rest on assumptions that do not hold for real-world data, and their results can be hard to interpret. They deliver a high precision that is not always necessary and can be costly to obtain. Using them also requires a solid mathematical background. The Naive Bayes classifier, a simple but powerful type of statistical classifier, is then studied in depth. In practice, words in context are inevitably semantically associated. That is, the individual words in a document do not satisfy the condition of being independent and identically distributed.
Therefore, the strong conditional independence and distribution assumptions underlying the Naive Bayes classifier do not hold for real text feature vectors and can sometimes lead to poor classification performance.

To address these shortcomings of the Naive Bayes algorithm, a fuzzy system and a neural network are introduced into text information processing, improving Naive Bayes classification performance by removing its disadvantages while retaining its advantages. Two properties are studied in particular as ways to amend the Naive Bayes algorithm: the ability of a fuzzy system to use prior (rule-based) knowledge, which resembles supervised text categorization, and the learning capability of a neural network, which lets it adapt to a changing environment. A Naive Bayes classifier based on AI is then implemented. The experimental results show that the amended algorithm not only greatly raises classification accuracy but also evens out the distribution of accuracy across categories, so that the classification results are close to those of manual classification.
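
The conditional independence assumption discussed in the abstract can be illustrated with a minimal sketch of a plain multinomial Naive Bayes text classifier. This is only the unmodified baseline, not the thesis's amended fuzzy/neural algorithm, which the abstract does not specify. The function names (`train_naive_bayes`, `classify`), the Laplace (add-one) smoothing, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Estimate log priors and Laplace-smoothed log likelihoods.

    docs is a list of token lists; labels is the matching list of classes.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    n_docs = len(docs)
    log_prior = {c: math.log(n / n_docs) for c, n in class_counts.items()}

    log_likelihood = {}
    for c in class_counts:
        # Add-one smoothing: every vocabulary word gets a pseudo-count of 1.
        denom = sum(word_counts[c].values()) + len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def classify(tokens, log_prior, log_likelihood, vocab):
    """Return the class with the highest posterior score.

    The per-word log likelihoods are simply summed, which is exactly the
    conditional independence assumption: each word is treated as
    independent of the others given the class.
    """
    scores = {}
    for c in log_prior:
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in tokens if w in vocab
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, for illustration only.
DOCS = [
    ["stock", "market", "rises"],
    ["team", "wins", "match"],
    ["market", "falls"],
    ["coach", "praises", "team"],
]
LABELS = ["finance", "sports", "finance", "sports"]

MODEL = train_naive_bayes(DOCS, LABELS)
print(classify(["market", "falls"], *MODEL))
print(classify(["team", "wins"], *MODEL))
```

Because the score is a sum over individual words, word order and co-occurrence carry no weight here, which is the limitation the thesis sets out to address.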
Keywords/Search Tags: Text Categorization, Naive Bayes, Fuzzy System, Neural Network, AI