Research Of Chinese Text Classification Based On Naive Bayesian Method And Application Of Microblogging Data Classification

Posted on: 2016-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
GTID: 2308330476454799
Subject: Applied Statistics
Abstract/Summary:
The Sina microblogging platform now has more than 1 billion registered users, and its microblogging data holds great potential value. At present, however, this large volume of data is not managed effectively, and the useful information in it still needs to be extracted. We classify microblogging data with a Naive Bayes classifier, and the results could have substantial commercial value. This thesis is mainly concerned with text classification. Research on text classification can be traced back to the 1960s. Early text classification was based mainly on knowledge engineering: texts were classified by manually defined rules, which took considerable time and labor, and writing appropriate rules required sufficient knowledge of the particular domain. In the 1990s, with the rapid growth of online text and the rise of machine learning, automatic text classification based on machine learning became mainstream. Of the many text classification methods, the Naive Bayes classifier is among the most widely used. This thesis first briefly introduces the content and methods of text classification. Second, it introduces several feature extraction methods, such as document frequency (DF) and term frequency–inverse document frequency (TF-IDF). Naive Bayes classifiers are then trained on features selected by DF and by TF-IDF, and the results are compared. The thesis studies Naive Bayes text classification in detail and then presents a Bayesian text classification project. Finally, the author discusses prospects for text classification.
Keywords/Search Tags: text classification, Naive Bayes, classifier, feature extraction
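
The pipeline named in the abstract (feature extraction by document frequency or TF-IDF, followed by a Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the `min_df` threshold, the add-one smoothing, and the toy English corpus are all assumptions made for illustration. Real microblog text in Chinese would first need word segmentation, which is assumed to have been done already.

```python
import math
from collections import Counter, defaultdict


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def select_by_df(docs, min_df=2):
    """Keep terms that occur in at least `min_df` documents (hypothetical threshold)."""
    df = document_frequency(docs)
    return {term for term, n in df.items() if n >= min_df}


def tfidf(tokens, df, n_docs):
    """TF-IDF weights for one document, using tf * log(N / df)."""
    tf = Counter(tokens)
    return {t: c * math.log(n_docs / df[t]) for t, c in tf.items() if df.get(t)}


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels, vocab):
        self.vocab = set(vocab)
        self.n_docs = len(docs)
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            # Only terms that survived feature selection are counted.
            self.term_counts[label].update(t for t in tokens if t in self.vocab)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n_label / self.n_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # terms outside the vocabulary are ignored
                    count = self.term_counts[label][t]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, pre-tokenized corpus (illustrative data, not from the thesis).
docs = [
    ["match", "team", "goal"],
    ["team", "win", "match"],
    ["stock", "market", "price"],
    ["market", "price", "rise"],
]
labels = ["sports", "sports", "finance", "finance"]

df = document_frequency(docs)
vocab = select_by_df(docs, min_df=2)
model = NaiveBayes().fit(docs, labels, vocab)
weights = tfidf(docs[0], df, len(docs))
```

Here `select_by_df` plays the role of DF-based feature selection, while `tfidf` shows how the alternative weighting is computed for one document. A TF-IDF-based selection could, for example, rank terms by their summed weights over the corpus and keep the top few before calling `fit`, which would allow the kind of comparison the abstract describes.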