
Research On The Classification Method Of News Text

Posted on: 2021-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: J R Zhang
GTID: 2428330602965448
Subject: Engineering
Abstract/Summary:
With the development of the industrial Internet, the network has become an indispensable platform and tool for people to exchange ideas and express opinions. Internet new media have become an important tool for the party, the government, and enterprises to strengthen public opinion guidance and promote innovation in social governance. Because network content is complex and diverse, it must be classified to meet users' needs for analyzing online public opinion. The familiar text classification methods are traditional classification methods and deep learning methods. Traditional methods produce highly sparse text feature vectors that suffer from the curse of dimensionality, lose the word order information of the text, and introduce redundancy, so feature vectors constructed in the traditional way do not represent the text accurately enough. To meet the accuracy requirement, external features of sentences are used as a relevance index to filter out irregular spam that is not a news event. The retained standard-format news event information and its semantic features are then extracted automatically with deep learning and classified at the level of news topic. The work of this thesis is as follows:

1. A method for constructing feature vectors from the external structural features of sentences is proposed. Taking non-standard messages and spam unrelated to news on Weibo, together with Weibo headlines, as the research object, 12 kinds of external features are extracted from the news text, such as the amount of text, sentence pattern, emotional tendency, and special characters. Weibo news text described by these external format features is more clearly distinguishable from unprocessed Weibo news text, which targets the sparsity and the curse of dimensionality of traditional text feature vector representations. Based on the external features of a variety of sentence types, machine learning classification methods are applied to improve generalization performance. Random forest was finally chosen in the experiments to perform binary classification of Weibo news text, so that standard text features are extracted and spam unrelated to news is filtered out.

2. A deep learning text classification method integrating CNN and GRU is proposed to process news text. The feature values obtained from the feature vectors constructed above are fed into the improved classification model to classify Weibo news. Because much of the text has diverse topics and sparse semantic information, this thesis proposes a deep model, C-GRU, to address these problems and classify Weibo news by topic. There are two reasons for this choice. First, a deep learning model integrates text feature extraction and classification, so training C-GRU both reduces the workload of preliminary feature engineering and produces the classification result. Second, C-GRU itself carries a gated forgetting structure. It introduces less noise when collecting feature vectors and enriches the vector set, which makes it easier to collect and associate news text keywords directly related to the topic, so that news text can be classified effectively through emotion words.

3. Comparative experiments show that the improved method used in this thesis raises the corresponding evaluation metrics for Weibo news text classification by 10%-15% compared with the traditional 8F type. When text with external sentence features is then applied to deep learning text classification, the classification accuracy of the improved C-GRU model is about 5% higher than that of CNN.
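The idea of describing a post by the external features of its sentences, rather than by a sparse bag-of-words vector, can be illustrated with a small sketch. The abstract does not enumerate the 12 features used in the thesis, so the seven features below (length, URL count, hashtag count, mention count, exclamation and question marks, and a leading 【...】 headline bracket) are assumptions chosen only for illustration, and the names `FEATURE_NAMES`, `external_features`, and `to_vector` are hypothetical.

```python
import re

# Hypothetical feature set: the thesis uses 12 external features that the
# abstract does not list, so these seven are illustrative assumptions.
FEATURE_NAMES = [
    "length",
    "url_count",
    "hashtag_count",
    "mention_count",
    "exclamation_count",
    "question_count",
    "has_headline_bracket",
]

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#[^#\s]+#")  # Weibo-style topic tag: #topic#
MENTION_RE = re.compile(r"@\w+")


def external_features(text):
    """Return a dict of surface-level (external) features for one post."""
    stripped = text.strip()
    return {
        "length": len(stripped),
        "url_count": len(URL_RE.findall(stripped)),
        "hashtag_count": len(HASHTAG_RE.findall(stripped)),
        "mention_count": len(MENTION_RE.findall(stripped)),
        # Count both ASCII and full-width punctuation marks.
        "exclamation_count": stripped.count("!") + stripped.count("！"),
        "question_count": stripped.count("?") + stripped.count("？"),
        # News posts often open with a bracketed headline such as 【...】.
        "has_headline_bracket": int(
            stripped.startswith("【") and "】" in stripped
        ),
    }


def to_vector(text):
    """Return the features as a dense, fixed-order list of numbers."""
    feats = external_features(text)
    return [float(feats[name]) for name in FEATURE_NAMES]


news = "【Breaking】 Flood warning issued http://t.cn/abc #weather# @newsdesk !!"
spam = "@a @b @c click here!!! http://x.y http://z.w"

news_vec = to_vector(news)
spam_vec = to_vector(spam)
```

Each post becomes a short dense vector of fixed length instead of a vocabulary-sized sparse one. Vectors of this kind could then be passed to a binary classifier such as a random forest, which is the classifier the abstract reports choosing for the news versus spam step.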
Keywords/Search Tags:External features, deep learning, text classification, news and public opinion