Research On News Text Classification Based On Deep Learning

Posted on:2021-04-27

Degree:Master

Type:Thesis

Country:China

Candidate:L F Wu

GTID:2428330605956945

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of Internet technology, the Internet has become an important platform for obtaining information. In recent years the number of Internet users has grown steadily, and the volume of news text on the network has grown explosively. How to classify and manage this mass of news text efficiently has become an active research topic. News text takes many forms of expression, its structure is not standardized, and its content varies in quality, all of which make text classification harder. An efficient text classification algorithm is therefore needed to classify and organize massive amounts of text and to extract valuable information from it. Deep learning, which extends machine learning, offers nonlinear computing ability that can represent the features of text data and process text efficiently. This thesis proposes a news text classification framework based on an improved convolutional neural network. The main work of the thesis is as follows:

(1) The word2vec word vector model captures only the local context semantics of a word and lacks global semantic information. To address this defect, the thesis represents text by combining word2vec word vectors with the LDA topic model. The Skip-gram model of word2vec is trained on the text to map it into a low-dimensional, dense vector space, and the cosine similarity between word vectors is then computed to measure the semantic relatedness of words. When the weight of a word vector is calculated, a part-of-speech weight factor is added to the weighting formula so that important words receive larger weights. Experiments show that the improved feature representation method can obtain word vectors with shallow semantic meaning.

(2) A multi-layer perceptron is introduced into the convolutional layer of the traditional convolutional neural network to improve the computational power of convolution and obtain high-quality features. First, the fused feature representation is used as the input of the model, and the multi-layer perceptron convolutional layer extracts the key features. Then the pooling layer reduces the dimensionality and filters the feature data. Finally, the resulting higher-quality features are connected through a fully connected layer and classified using Softmax. The experimental results show that the proposed model achieves an accuracy of 92.4%, a recall of 91.9%, and an F1 value of 92.2%. The classification performance is good, indicating that the improved model can improve the efficiency of text classification.

Figures: 24; tables: 12; references: 55.
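
The pipeline outlined in the abstract (weighted word vectors fused with topic information, a convolutional layer followed by a perceptron layer, pooling, and Softmax) might look roughly like the NumPy sketch below. It is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the weighting formula, the fusion rule, the pooling type, or any layer sizes, so the multiplicative part-of-speech weights, the concatenation of the topic distribution, max pooling, ReLU activations, and all dimensions and random parameters here are hypothetical. The function names are also invented for this example.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def fuse_features(word_vecs, pos_weights, topic_dist):
    """Assumed fusion rule: scale each word vector by a part-of-speech weight,
    then append the document's topic distribution to every word position."""
    weighted = word_vecs * pos_weights[:, None]             # (n_words, d)
    topics = np.tile(topic_dist, (word_vecs.shape[0], 1))   # (n_words, k)
    return np.concatenate([weighted, topics], axis=1)       # (n_words, d + k)

def mlpconv(x, w_conv, w_mlp):
    """Convolution over word windows followed by a per-position perceptron layer."""
    h = w_conv.shape[0] // x.shape[1]                        # window height in words
    windows = np.stack([x[i:i + h].ravel()
                        for i in range(x.shape[0] - h + 1)])
    hidden = np.maximum(windows @ w_conv, 0.0)               # linear filters + ReLU
    return np.maximum(hidden @ w_mlp, 0.0)                   # extra perceptron layer + ReLU

def softmax(z):
    """Numerically stable Softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(x, w_conv, w_mlp, w_fc):
    """Forward pass: mlpconv -> max pooling over positions -> dense layer -> Softmax."""
    feats = mlpconv(x, w_conv, w_mlp)                        # (n_windows, n_filters)
    pooled = feats.max(axis=0)                               # assumed max-over-time pooling
    return softmax(pooled @ w_fc)                            # class probabilities

# Toy data: random stand-ins for trained word2vec vectors and an LDA topic distribution.
rng = np.random.default_rng(0)
n_words, d, k, h, n_filters, n_classes = 6, 8, 4, 3, 5, 3
word_vecs = rng.normal(size=(n_words, d))
pos_weights = np.array([1.5, 1.0, 0.5, 1.5, 1.0, 0.5])      # hypothetical POS weights
topic_dist = np.full(k, 1.0 / k)                             # uniform topics for illustration

x = fuse_features(word_vecs, pos_weights, topic_dist)
w_conv = rng.normal(size=(h * (d + k), n_filters))
w_mlp = rng.normal(size=(n_filters, n_filters))
w_fc = rng.normal(size=(n_filters, n_classes))
probs = classify(x, w_conv, w_mlp, w_fc)
```

Here `probs` holds one probability per class for a single six-word document. In a real system the word vectors would come from a trained Skip-gram model, the topic distribution from a fitted LDA model, and the weight matrices from training rather than from a random generator.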

Keywords/Search Tags:

text classification, word2vec, LDA, MLPCNN

Related items

1	Research On Text Classification Based On Word2vec Word Vector
2	Research Of Text Classification Based On Word2vec And Self-attention
3	Comparison And Combination Of Text Classification Based On Word2vec With SVC And AT-LSTM
4	Research On Text Classification Based On Word2vec And Convolutional Neural Network
5	Research And Application Of Chinese Text Classification Technology
6	Research On Text Classification Based On Multi-factor Features
7	Research On Text Classification Algorithms Based On Word Vector
8	Text Modeling And Classification Based On Word2vec
9	Research On Chinese Short Text Classification Based On Word Embedding
10	Research On Text Classification Method Based On Bidirectional LSTM