Research Of Chinese Text Classification Based On Naive Bayesian Method And Application Of Microblogging Data Classification

Posted on:2016-08-23

Degree:Master

Type:Thesis

Country:China

Candidate:C Li

Full Text:PDF

GTID:2308330476454799

Subject:Applied Statistics

Abstract/Summary:

Sina Weibo now has more than 1 billion registered users, and its microblog data holds great potential value. At present, however, we do not manage this large volume of data effectively, and the useful information in it still needs to be extracted. We classify microblog data with a Naive Bayes classifier, and the results could have considerable commercial value. This thesis mainly deals with text classification. Research on text classification can be traced back to the 1960s. Early text classification was based mainly on knowledge engineering: texts were classified by manually defined rules, which cost a great deal of time and labor, and writing appropriate rules required sufficient knowledge of the particular field. In the 1990s, with the rapid growth of online text on the Internet and the rise of machine learning, automatic text classification based on machine learning became the mainstream. Among the many text classification methods, the Naive Bayes classifier is one of the more widely used. This thesis first briefly introduces the content and methods of text classification. Second, it introduces several feature extraction methods, such as document frequency (DF) and term frequency–inverse document frequency (TF-IDF). Naive Bayes classifiers are trained with features obtained by document frequency and by TF-IDF, and the results are compared. The thesis then studies text classification based on Naive Bayes in detail and introduces the Bayesian text classification project. Finally, the author discusses the prospects for text classification.
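
The pipeline named in the abstract (document-frequency feature selection followed by a Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the author's implementation. The function names, the toy English tokens, the `top_k` cutoff, the multinomial model, and the Laplace smoothing are all assumptions made for illustration. Chinese text would first need word segmentation, which is assumed to have been done already. TF-IDF weighting is not shown.

```python
import math
from collections import Counter, defaultdict


def select_by_document_frequency(docs, top_k):
    """Keep the top_k terms that occur in the most documents (hypothetical helper)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term at most once per document
    # Sort by descending document frequency, then alphabetically to break ties.
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[:top_k]}


def train_naive_bayes(docs, labels, vocab):
    """Estimate log priors and Laplace-smoothed log likelihoods (multinomial model)."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        total = sum(term_counts[c].values()) + len(vocab)  # add-one smoothing
        log_like[c] = {t: math.log((term_counts[c][t] + 1) / total) for t in vocab}
    return log_prior, log_like


def predict(tokens, log_prior, log_like):
    """Return the class with the highest posterior log score; unknown terms are ignored."""
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy, pre-segmented documents (assumed data, not from the thesis).
docs = [
    ["stock", "market", "rise"],
    ["market", "fall", "bank"],
    ["match", "goal", "team"],
    ["team", "win", "match"],
]
labels = ["finance", "finance", "sports", "sports"]

vocab = select_by_document_frequency(docs, top_k=3)
log_prior, log_like = train_naive_bayes(docs, labels, vocab)
print(predict(["team", "goal", "match"], log_prior, log_like))  # sports
print(predict(["market", "bank"], log_prior, log_like))         # finance
```

Working in log space avoids numerical underflow when many term probabilities are multiplied, and add-one smoothing keeps a term unseen in one class from forcing that class's score to negative infinity.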

Keywords/Search Tags:

text classification, Naive Bayes, classifier, feature extraction

Related items

1	A Text Classifier About High Blood Pressure Based On Naive Bayes
2	Text Classification Method Based On Unsupervised Clustering And Naive Bayesian Classifier
3	Research On Text Classification Algorithm Based On Naive Bayes Method
4	Prediction Of Protein Contact Map Based On Weighted Naive Bayes Classifier And Extreme Random Tree
5	Research And Implementation Of Text Classification Technology Based On Bayesian Theory
6	Text Classification Algorithm Research Based On Naive Bayes
7	Text Categorization Based On Naive Bayes Method
8	Research And Implementation On Feature Extraction And Classification Of Chinese Text Based On SPARK
9	Research On Spam Text Classification Based On Improved Naive Bayes Algorithm
10	Design And Implementation Of Text Classification System Based On K-neighborhood And Naive Bayesian