Text Categorization Based On Naive Bayes Method

Posted on: 2019-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z Cui
GTID: 2428330548960180
Subject: Computer technology
Abstract/Summary:
With the explosive growth of information, the volume of network information is rising rapidly, but processed knowledge remains relatively scarce. How to organize, classify, and collect data, and so reduce data clutter, is the central problem. Text classification is an important research field in data mining and machine learning and a powerful means of organizing and managing data. This paper implements a text categorization method based on the Naive Bayes algorithm and verifies its effect. First, building on domestic and foreign research results, the paper analyzes the types of text classification, the general text classification process, and text vector representation. It then summarizes the Naive Bayes algorithm, the K-nearest-neighbor classification algorithm, and support vector machines (SVM), and analyzes the application of Naive Bayes in text classification, which serves as the theoretical basis of this study. Next, the Naive Bayes model is used to classify text, and a text classification algorithm based on Naive Bayes is proposed. To address the mismatch between real news headlines and news content, the paper introduces text summarization and combines it with the Naive Bayes model to propose a hot news classification algorithm based on text summarization. Finally, the paper takes online news websites as the data source, builds a Hadoop distributed platform, and completes basic data processing through web crawling, web page analysis, and Chinese text segmentation. Simulation experiments compare the classification results of the Naive Bayes model and the SVM model to verify the effect of the Naive Bayes model. The Naive Bayes text classification algorithm is also compared with the summarization-based Naive Bayes classification algorithm. The results verify the advantages of the Naive Bayes based text classification algorithm.
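The abstract does not state which Naive Bayes variant the thesis uses. As an illustration only, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. The function names `train` and `predict`, the toy English documents, and the category labels are all hypothetical, and the input is assumed to be already tokenized (for Chinese text, that would mean after word segmentation).

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and word occurrences per class.

    docs: list of (tokens, label) pairs, where tokens is a list of words.
    """
    class_docs = Counter()              # number of training documents per class
    word_counts = defaultdict(Counter)  # per-class word frequencies
    vocab = set()                       # all words seen during training
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior: fraction of training documents in this class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothed log likelihood of the word given the class.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, not data from the thesis.
training_docs = [
    (["team", "wins", "match"], "sports"),
    (["player", "scores", "goal"], "sports"),
    (["market", "stocks", "fall"], "finance"),
    (["bank", "raises", "rates"], "finance"),
]
model = train(training_docs)
print(predict(model, ["team", "scores"]))
print(predict(model, ["bank", "stocks"]))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from forcing that class's probability to zero.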
Keywords/Search Tags: Naive Bayes, text classification, text theme, text abstract, Hadoop platform