Multilingual Document Classification Based On Bayesian Algorithm

Posted on: 2017-10-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhu
Full Text: PDF
GTID: 2348330512456317
Subject: Computer technology
Abstract/Summary:
Information technology has developed rapidly and is now maturing. People no longer obtain information only from newspapers or by word of mouth; they acquire it through a variety of media, such as television, computers and mobile phones. Faced with this mass of data, people increasingly expect to find useful information in a very short time, so organizing and managing information effectively has become more urgent. Traditional single-language text classification systems cannot meet these demands, which makes multilingual text classification, able to classify large volumes of information accurately and quickly, especially important. After reviewing the development of text classification and comparing the Bayesian algorithm, the K-nearest neighbor algorithm and the Rocchio algorithm, this thesis designs and implements a multilingual text classifier based on the Bayesian algorithm. Preliminary testing indicates that the system can perform the classification task.
Keywords/Search Tags: text classification, Naive Bayes, multi-language
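
The abstract does not describe the implementation, so the following is only a minimal sketch of how a multinomial Naive Bayes text classifier might handle more than one language. Everything in it is an assumption made for illustration: the `tokenize` helper (Latin-script words plus single CJK characters), the `NaiveBayes` class, the Laplace smoothing parameter `alpha`, and the toy English and Chinese training sentences. None of it is taken from the thesis.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase Latin-script words and digits,
    # plus each CJK character as its own token.
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_docs = Counter()               # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocab = set()                        # tokens seen in training

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            self.class_docs[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch.
docs = [
    "the team won the football match",
    "球队 赢得 比赛",
    "the bank raised interest rates",
    "银行 提高 利率",
]
labels = ["sports", "sports", "finance", "finance"]

model = NaiveBayes().fit(docs, labels)
print(model.predict("football match tonight"))
print(model.predict("银行 利率"))
```

Sharing one vocabulary across languages keeps the sketch short; a real system could instead detect the language first and use a language-specific tokenizer and model for each.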