Research And Improvement On Naive Bayes Text Classifier

Posted on: 2006-11-17 | Degree: Master | Type: Thesis
Country: China | Candidate: M X Zhang | Full Text: PDF
GTID: 2168360155474264 | Subject: Computer application technology
Abstract/Summary:
With the growth of networked information, automatic information classification has become an essential tool for finding useful information. As a classification method, the Naive Bayes classifier has been applied in many fields. Its advantage lies in its use of prior information, which provides a model and a method for reasoning under uncertainty.

First, this thesis describes the text classification system, including text representation, feature extraction, and text classification methods. It then discusses the Bayes classifier model and algorithm.

Next, the thesis introduces the data sparseness problem of the Bayes classifier. It points out the disadvantages of the Laplace smoothing used by the traditional Naive Bayes classifier and proposes replacing it with smoothing methods from the statistical language model (uni-gram): Jelinek-Mercer smoothing, Dirichlet smoothing, and absolute-discounting smoothing.

The main contribution is an improved Bayes classifier in which the Laplace smoothing of the traditional classifier is replaced by each of these three uni-gram smoothing methods. The specific algorithm and framework are presented. To validate the work, we test the three uni-gram smoothing methods in the Bayes classifier and select suitable parameter values. Our experimental results show that, with language-model smoothing, we obtain better performance than the traditional Naive Bayes classifier.

In future work, the Bayes classifier could be improved further with smoothing from bi-gram and tri-gram statistical language models.
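As an illustration of the idea summarized above, the following minimal Python sketch swaps the smoothing rule inside a multinomial Naive Bayes classifier. It is not the thesis's implementation: the function names (`train`, `word_prob`, `classify`), the parameter values, and the exact formulas are assumptions, with the three language-model estimators written in their commonly used uni-gram forms, where the background model is the word's relative frequency in the whole training collection.

```python
import math
from collections import Counter


def train(docs):
    """docs: iterable of (label, tokens). Returns (class_counts, collection, class_docs)."""
    class_counts = {}       # label -> Counter of word counts in that class
    collection = Counter()  # word counts over the whole training collection
    class_docs = Counter()  # number of training documents per class
    for label, tokens in docs:
        class_counts.setdefault(label, Counter()).update(tokens)
        collection.update(tokens)
        class_docs[label] += 1
    return class_counts, collection, class_docs


def word_prob(word, counts, collection, method, param=None):
    """Smoothed estimate of P(word | class) for a word seen in the collection."""
    n = sum(counts.values())                           # tokens in the class
    c = counts[word]                                   # Counter gives 0 if unseen
    bg = collection[word] / sum(collection.values())   # background P(word | collection)
    if method == "laplace":        # add-one smoothing of traditional Naive Bayes
        return (c + 1) / (n + len(collection))
    if method == "jm":             # Jelinek-Mercer: linear interpolation, param = lambda
        return (1 - param) * c / n + param * bg
    if method == "dirichlet":      # Dirichlet prior, param = mu
        return (c + param * bg) / (n + param)
    if method == "absolute":       # absolute discounting, param = delta in (0, 1]
        unique = len(counts)       # distinct words seen in the class
        return max(c - param, 0) / n + param * unique / n * bg
    raise ValueError(f"unknown smoothing method: {method}")


def classify(tokens, model, method="jm", param=0.5):
    """Return the label maximizing log P(class) + sum of log P(word | class)."""
    class_counts, collection, class_docs = model
    n_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, counts in class_counts.items():
        score = math.log(class_docs[label] / n_docs)
        for w in tokens:
            if w in collection:    # words never seen in training are skipped
                score += math.log(word_prob(w, counts, collection, method, param))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In this sketch, a word that is absent from a class but present elsewhere in the collection receives probability mass proportional to its collection frequency, whereas Laplace smoothing gives every unseen word the same mass. The parameters `lambda`, `mu`, and `delta` would need to be tuned on held-out data, in line with the parameter selection the abstract mentions.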
Keywords/Search Tags: Naive Bayes text classifier, data sparseness, Laplace smoothing, statistical language model, uni-gram smoothing