Research On News Text Classification Model Based On Deep Learning

Posted on: 2020-10-14 | Degree: Master | Type: Thesis
Country: China | Candidate: L Jiang | Full Text: PDF
GTID: 2438330575959326 | Subject: Computer application technology
Abstract/Summary:
Text classification, a basic technology of information processing, has long been a research hotspot in natural language processing. The key steps of text classification, namely text preprocessing, text representation, feature selection, and the classification algorithm, all affect the classification results, and the algorithms involved are a focus of scholarly research. With the rise of deep learning, many network models have performed well in text classification. News text is easy to obtain and available in large quantities, so research on news text classification is relatively low-cost, and the technology serves as basic support for other applications. The classification of news texts therefore has an important impact on news recommendation, data journalism, advertising push, and other fields. To improve the classification accuracy of news texts, the main work and innovations of this paper are as follows:

1. This paper studies and introduces the basic process of text classification in natural language processing, and explains in detail the machine learning and deep learning techniques involved. For text representation and feature selection, it selects the word embedding method to suit the characteristics of news text and uses the Word2Vec tool to represent the text data. This representation preserves the semantic relationships between word vectors, avoids the curse of dimensionality, and improves classification performance.

2. Building on previous work, the SRB text simplification model and the nested LSTM (NLSTM) model are each studied and improved, and a hybrid model based on text simplification is proposed. First, the news text is simplified step by step through the SRB network, generating simple sentences with high semantic relevance; this simplifies the later sentence-level training without losing semantic information. Second, the sentences are fed into the NLSTM network, which learns the semantic relevance between sentences and their feature representation.

3. The proposed hybrid model also adopts an attention mechanism to highlight the feature expression of key sentences. It not only adapts to the characteristics of news texts but also emphasizes the role of key sentences while capturing the relevance of contextual features, thereby combining the advantages of each component model.

4. The proposed hybrid model is compared with five typical deep learning models in multiple sets of comparison experiments on three popular Chinese news data sets. The experiments show that the proposed model achieves state-of-the-art classification accuracy. Finally, the influence of the parameters on the results is explored by adjusting them.
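The abstract does not give the attention formulation used in the hybrid model. As a rough illustration of the general idea of weighting key sentences, the following minimal sketch applies dot-product attention to a list of sentence vectors. The function names, the `query` vector, and the toy numbers are all assumptions made for this example, not details taken from the thesis.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(sentence_vectors, query):
    """Hypothetical helper: dot-product attention over sentence vectors.

    Each sentence vector is scored against `query`; the scores are
    normalized with softmax and used to form a weighted sum, so that
    higher-scoring ("key") sentences contribute more to the result.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in sentence_vectors]
    weights = softmax(scores)
    dim = len(sentence_vectors[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, sentence_vectors))
              for d in range(dim)]
    return weights, pooled


# Toy stand-ins for the sentence-level outputs of a recurrent network.
sentence_vectors = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]  # assumed learned context vector, fixed here for illustration

weights, doc_vector = attention_pool(sentence_vectors, query)
```

In this toy input the second sentence has the largest dot product with the query, so it receives the largest weight and dominates the pooled document vector that a classifier would then consume.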
Keywords/Search Tags:text categorization, LSTM network, attention mechanism, text simplification, hybrid deep learning model