Research On Text Classification Based On Deep Learning And Topic-driven

Posted on:2020-12-07

Degree:Master

Type:Thesis

Country:China

Candidate:W Y Gong

GTID:2518306218969959

Subject:Information and Communication Engineering

Abstract/Summary:

Text classification is a key technology for text data mining and knowledge acquisition in natural language processing. With the rapid development of the Internet, the volume of text data has exploded and the number of topics has increased dramatically, which makes text classification harder. How to manage massive amounts of text data efficiently by theme, and how to sort disorganized text into clear topics for orderly management, has become an urgent problem. Topic-driven classification means determining the topic of the text to be classified by reference to topic-specific text data. Motivated by the success of deep learning at feature extraction in image processing, speech recognition, and computer vision, this study applies deep learning to the news text classification task, building on CNN and LSTM. The main work is as follows:

1. The traditional bag-of-words text representation produces sparse features and cannot represent contextual information. The Skip-gram model in Word2Vec is used to map each word in a document to a real-valued vector of fixed dimension, which avoids these limitations of the bag-of-words method. Experiments show that classification accuracy is best when the word vector dimension is 300.

2. To address deep feature extraction in news text classification, this thesis combines the advantages of CNN and BiLSTM to obtain a BiLSTM-CNN model and applies it to the news text classification task. Experiments show that the BiLSTM-CNN model achieves better classification accuracy than a single CNN or BiLSTM.

3. To address the low accuracy of BiLSTM-CNN on news text classification, this thesis uses the BiLSTM-CNN model to extract features and XGBoost to classify the extracted features, converting a weak classification problem into a strong one. The results were compared with Naive Bayes, SVM, and KNN; the XGBoost-based classifier performs better than all three.
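
To illustrate how a Skip-gram model sees a document, the sketch below generates the (center word, context word) pairs that such a model is trained on. It is not taken from the thesis: the function name `skipgram_pairs`, the window size, and the sample sentence are all assumptions chosen for illustration, and the embedding training itself (which the thesis does with Word2Vec) is omitted.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for Skip-gram training.

    Each word is paired with every other word that lies at most
    `window` positions to its left or right. The window size is a
    hypothetical setting, not a value reported in the thesis.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # left edge of the context window
        hi = min(len(tokens), i + window + 1)  # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                         # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical tokenized news headline
sentence = ["stock", "market", "falls", "sharply"]
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

Because every word is trained to predict its neighbors, words that occur in similar contexts end up with similar vectors, which is the contextual information a bag-of-words representation cannot capture.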

Keywords/Search Tags:

Text Classification, Deep Learning, Convolutional Neural Network, Bi-directional Long Short-Term Memory, XGBoost

Related items

1	Short Text Sentiment Classification Based On Deep Learning
2	Research On Text Classification Based On Deep Learning
3	Research On Key Problems In Text Classification Research Based On Deep Learning
4	Research On Text Classification Based On Deep Learning
5	Research On Text Sentiment Classification Method Based On Deep Learning
6	Research On Short Text Classification Method Based On Contextual Feature Expression
7	Research And Implementation Of Multilingual Text Classification System Based On Deep Learning
8	Application Of Deep Learning In Financial Time Series Classification
9	Research On Sentiment Classification Of Weibo Text Data Based On Deep Learning Algorithm
10	Research On Text Emotion Classification Algorithm Based On Deep Learning Technology