Research On Chinese News Headlines Based On LSTM-attention

Posted on: 2020-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: S Yang
Full Text: PDF
GTID: 2428330596481783
Subject: Management Science and Engineering
Abstract/Summary:
According to the "Statistical Report on the Development of China's Internet Network" released in mid-August 2018, the number of Internet users in China exceeded 800 million in the first half of 2018, of which mobile Internet users account for 98%. This means that, in the mobile Internet era, people's preference for acquiring information has gradually shifted from the traditional Internet to the mobile Internet, and their preference for mobile terminals has grown stronger. According to statistics on Chinese Internet users' use of various Internet applications, more than 660 million Chinese netizens use online news applications in their daily lives, and the usage rate of these applications ranks second among all application categories, behind only instant messaging applications. With regard to the development prospects of the news industry, it is well worth studying how to combine online news technologies with cutting-edge technologies such as artificial intelligence in the mobile Internet era, so as to present better and more valuable content to users.

News is an important way for people to learn about current affairs and the latest industry-related developments. Classifying news helps organize it in an orderly way, and mining news texts can guide decision-making. News classification is essentially a text classification problem, and text classification is an important direction in natural language processing, so news text classification has been studied extensively. At present, however, most text classification work is limited to conventional machine learning algorithms, and the research and application of deep learning algorithms remain limited; this thesis aims to study this aspect of the problem.

This thesis uses news headlines as the entry point for news classification. Since news headlines are short texts, the thesis mainly uses deep learning methods to classify them and, in view of the problems that appeared in earlier classification work, introduces the recent Self-Attention model into a deep learning network to address common problems in news classification and short text classification. For the specific problem of news headline classification, the thesis uses the Self-Attention mechanism to process the word-vector input sequence of the LSTM and thereby improve the LSTM's classification performance.

Because new words constantly appear in news headlines, the experimental dataset combines the official NLPCC 2017 Task2 Chinese news headline dataset with a supplementary dataset of news headlines crawled over nearly a year from portals such as Toutiao (Today's Headlines) and Sina. In the experiments, the model achieved good results on news headline classification. Compared with the commonly used short text classification models Bi-LSTM, CNN-LSTM, LSTM-Attention and CNN-Attention, the model's classification accuracy on Chinese news headline short texts finally reached about 85%, close to the 86% accuracy level. Overall, the model classifies news headlines well.
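As a rough illustration of the self-attention step that the abstract says is applied to the word-vector sequence before the LSTM, the sketch below computes scaled dot-product self-attention over a toy headline. It is not the thesis's implementation: the function names, the toy embeddings, and the use of identity projections (queries, keys and values all equal to the input vectors, with no learned weights) are assumptions made for illustration only. In a full model, the re-weighted vectors would then be fed to an LSTM classifier.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with identity projections.

    seq: list of word vectors (lists of floats), one per token.
    Returns (context, weights): the re-weighted vectors and the
    attention matrix (one row of weights per token).
    Assumption: queries = keys = values = the input vectors.
    """
    d = len(seq[0])
    context, weights = [], []
    for q in seq:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all token vectors, dimension by dimension.
        context.append(
            [sum(w[j] * seq[j][i] for j in range(len(seq))) for i in range(d)]
        )
    return context, weights


# Toy 3-token "headline" with 2-dimensional embeddings (made-up values).
headline = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = self_attention(headline)
```

Each row of `weights` sums to 1 and shows how strongly one token attends to the others; `context` has the same shape as the input, so it can replace the raw word vectors as the sequence passed to a recurrent layer.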
Keywords/Search Tags:Chinese news headline, short text classification, LSTM, Self Attention