Web Document Classification Based On Discriminative Learning And Multiple Classifiers Ensemble

Posted on: 2009-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Zhang
GTID: 2178360242974810
Subject: Computer software and theory
Abstract/Summary:
Discovering useful information in the vast amount of web resources is an urgent problem. Web document classification is a key technology for addressing it and supports the development of web search and filtering. Naive Bayesian classification is an important one among the many web document classification algorithms: it is simple and efficient, but its accuracy needs to be improved. This thesis discusses how to improve web document classification accuracy based on the Naive Bayesian method. Its main contributions are: (1) We introduce into web document classification a discriminative Naive Bayesian parameter learning method whose objective function is the K-L distance (Kullback-Leibler divergence) between the empirical distribution and the true distribution of the data. We also take into account the hierarchical relations between categories and combine them with discriminative learning, obtaining a new discriminative Naive Bayesian hierarchical classification method based on K-L distance. On our Chinese web data collection, the new method performs well. (2) We divide each web page into several parts according to its structure, train one classifier for each part, and then combine these classifiers into a single ensemble classifier. Our experimental results show that the multiple classifiers ensemble method is feasible: after combination, classification accuracy is higher than before, for both flat and hierarchical classification. Of the four ensemble methods used in the thesis, the independence principle and the maximum principle achieve higher accuracy than the voting and weighted sum methods.
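The abstract names four ensemble methods but does not define them. The following is a minimal sketch, assuming the common textbook forms of these rules (product of posteriors for the independence principle, per-class maximum for the maximum principle, majority vote, and a weighted average of posteriors). The page parts ("title", "body", "links"), the probabilities, and the weights are hypothetical illustration values, not data or settings from the thesis.

```python
from collections import Counter
from math import prod

def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)

def independence_rule(posteriors):
    """Multiply per-part posteriors, treating the parts as independent."""
    n_classes = len(posteriors[0])
    return argmax([prod(p[c] for p in posteriors) for c in range(n_classes)])

def maximum_rule(posteriors):
    """For each class, keep the most confident part's posterior."""
    n_classes = len(posteriors[0])
    return argmax([max(p[c] for p in posteriors) for c in range(n_classes)])

def voting_rule(posteriors):
    """Each part votes for its own top class; the most common vote wins."""
    votes = [argmax(p) for p in posteriors]
    return Counter(votes).most_common(1)[0][0]

def weighted_sum_rule(posteriors, weights):
    """Weighted sum of per-part posteriors (weights are assumed given)."""
    n_classes = len(posteriors[0])
    return argmax([
        sum(w * p[c] for w, p in zip(weights, posteriors))
        for c in range(n_classes)
    ])

# Hypothetical posteriors P(class | part) over three classes, one row per
# page part: title, body, links. Not taken from the thesis.
posteriors = [
    [0.6, 0.3, 0.1],  # title classifier
    [0.4, 0.5, 0.1],  # body classifier
    [0.2, 0.7, 0.1],  # links classifier
]
weights = [0.5, 0.3, 0.2]  # hypothetical per-part weights

print(independence_rule(posteriors))           # class index by product rule
print(maximum_rule(posteriors))                # class index by maximum rule
print(voting_rule(posteriors))                 # class index by majority vote
print(weighted_sum_rule(posteriors, weights))  # class index by weighted sum
```

With these illustrative numbers the weighted sum picks a different class from the other three rules, because the heavily weighted title classifier disagrees with the body and links classifiers. This shows only that the rules can disagree; it says nothing about which rule is more accurate on real data.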
Keywords/Search Tags:Naive Bayesian, Discriminative Learning, Hierarchical Classification, Web Document Classification, Multiple Classifiers Ensemble