Research On Chinese News Classification Algorithm Based On Deep Learning

Posted on:2021-12-13

Degree:Master

Type:Thesis

Country:China

Candidate:M M Dou

GTID:2518306521489244

Subject:Master of Engineering

Abstract/Summary:

With the rapid development of the big data era, the volume and complexity of text data are increasing quickly, and there is an urgent need for more effective ways to classify and manage these resources. Text classification can process text information effectively and improve its utilization. News is the most effective way for people to learn about current affairs, and its content consists mainly of unstructured text data. Studying news classification is therefore of great practical significance and supports the development of fields such as personalized news recommendation and advertising push. This thesis uses deep learning to study news classification. The main work is as follows.

First, the thesis introduces the research background and significance of text classification, analyzes the state of research at home and abroad, summarizes the problems that remain at this stage, and then proposes corresponding improved algorithms from the perspective of news classification.

Second, to address insufficient feature extraction, difficulty in processing sentence structure information, and difficulty in capturing long-distance dependencies when traditional convolutional neural networks are applied to Chinese text classification, a hybrid neural network classification model, TC-ABlstm (Text Convolutional Attention Bidirectional Long Short-Term Memory), is proposed. The model improves the traditional convolutional neural network to strengthen its ability to extract local text features, builds a bidirectional long short-term memory network combined with an attention mechanism to capture the global features of the text context, and finally combines the advantages of the two to improve classification accuracy.

Third, to address polysemy in word vectors trained by common pre-training models and the influence of word segmentation on Chinese text, the BERT model is used to produce the word vector representation. Because news texts are relatively long and the BERT model is limited in the length of text it can represent, the TextRank algorithm is used to extract key sentences from each news item before the BERT model is used for classification, so that the retained text is more representative.

Finally, the two algorithms proposed in this thesis are tested on two real data sets. The results show that both models can effectively improve the accuracy of Chinese news classification.

Keywords/Search Tags:

deep learning, text categorization, convolutional neural network, BiLSTM, BERT model

Related items

1	Research On Chinese Text Classification Based On Deep Learning Theory
2	Research On Text Sentiment Classification Method Based On Deep Learning
3	Research On BERT-based Chinese Long Text Classification Algorithm
4	Study On Text Categorization Method Based On Graph Convolutional Networks
5	Research On Text Sentiment Classification Based On Deep Neural Network
6	Research On News Text Classification Model Based On Deep Learning
7	Research On Text Classification Model Based On Improved Graph Neural Network
8	Research On The Parallelization Of Text Categorization Based On Convolution Neural Network
9	Research On Text Classification Of Chinese News Based On Deep Learning
10	Research On Text Classification Based On Hybrid Model Of Deep Learning