Design And Implementation On The GPU Of Bayesian Text Classifying Algorithm

Posted on:2015-10-27

Degree:Master

Type:Thesis

Country:China

Candidate:C P Yang

Full Text:PDF

GTID:2298330467463540

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of the mobile Internet and enterprise informatization, more and more information on the Internet is stored as text. Extracting valuable information from this massive volume of data has become an urgent challenge, and it gave rise to the research field of text mining. Text mining can be used to retrieve or filter useful information from massive, unlabeled document collections. The efficiency of a text mining algorithm depends closely on the dimensionality of the data and the size of the data set: when the data is too large, the algorithm hits a performance bottleneck. Data mining algorithms running on a single CPU can no longer meet user demand.
This thesis designs a parallel naive Bayesian classification system that performs classification in parallel, based on the principle of the naive Bayesian classification algorithm, the architecture of the GPU, and the CUDA (Compute Unified Device Architecture) programming model. By making full use of the GPU's computing power, the system improves the efficiency of text data mining. The main work is as follows. First, the thesis studies the principle of the naive Bayesian algorithm, the GPU architecture, and the CUDA programming model; it divides the naive Bayesian algorithm into several steps, identifies the steps that can run in parallel, and then designs the parallel naive Bayesian classification system. The system contains five modules: a preparation module, a text training module, a text classification module, a classification result evaluation module, and a classification result feedback module. The modifications are made mainly in the text training and text classification modules.
Finally, the thesis applies further efficiency optimizations based on the GPU architecture. Tests on four different data sets, run on a combined CPU and GPU platform, show that the parallel text classification system implemented in this thesis achieves a good speedup.
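
The decomposition described in the abstract can be illustrated with a minimal CPU-side sketch in Python. It is not the thesis's CUDA implementation: the function names (`train`, `score_document`, `classify_batch`), the multinomial model, and the Laplace smoothing are all assumptions made for illustration. The point it shows is that scoring one document does not depend on any other document, which is the kind of step that could plausibly be mapped to one GPU thread or block.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Training step. docs is a list of (tokens, label) pairs.

    Returns (log_prior, log_like) for a multinomial naive Bayes model
    with Laplace (add-one) smoothing. The model choice is an assumption.
    """
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = sum(class_docs.values())
    log_prior = {c: math.log(n / total_docs) for c, n in class_docs.items()}

    log_like = {}
    for c in class_docs:
        denom = sum(word_counts[c].values()) + len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return log_prior, log_like


def score_document(tokens, log_prior, log_like):
    """Classify one document. Reads only the shared model, so calls for
    different documents are independent of each other."""
    best_label, best_score = None, -math.inf
    for c in sorted(log_prior):
        # Words never seen in training are skipped (a simplifying assumption).
        score = log_prior[c] + sum(
            log_like[c][w] for w in tokens if w in log_like[c]
        )
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def classify_batch(batch, log_prior, log_like):
    """Data-parallel step: each iteration could run concurrently."""
    return [score_document(tokens, log_prior, log_like) for tokens in batch]
```

Working in log space turns the product of word probabilities into a sum, which avoids floating-point underflow on long documents. In a GPU version, the model tables would typically be copied to device memory once and then read by every thread.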

Keywords/Search Tags:

document classification, naive Bayesian, CUDA, parallel computing

Related items

1	The Research And Implementation Of Parallel Algorithm For Bayesian Text Classification Based Spark Computing Environment
2	Design And Implementation Of A Document Clustering Algorithm Based On The GPU
3	Implementation Of Two-dimensional DFT Parallel Algorithm On CUDA
4	Design And Implementation Of Parallel SM4-GCM Based On CUDA
5	Research Based On CUDA Parallel Computation Of FFT
6	Research And Implementation Of The Gene Bayesian Network Construction Algorithm Based On Multi-core Environment
7	A Study On The Parallel Computing Methods Of Visual Hull Based On CUDA
8	Parallel Design And Implementation Of AP Clustering Algorithms Based On CUDA
9	Research And Application Of Naive Bayesian Classifier
10	A Parallel Image Stabilization Algorithm Based On CUDA