Design And Implementation Of News Text Classification And Automatic Digest System Based On Deep Learning

Posted on: 2021-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Pan
GTID: 2428330620964052
Subject: Engineering
Abstract/Summary:
As Internet content is continually updated and shared, information overload is becoming increasingly serious. Unlike images and sound, news text is not intuitive: it has to be read and understood, which makes it hard to pick out the key information and dampens readers' interest. To address this, this thesis applies natural language processing and deep learning to news text in order to classify news at a fine-grained level and to generate abstractive summaries for news under specific topics. The resulting news text classification and automatic summarization system lets users browse relevant news in a more targeted way and read condensed summaries, saving them time. The research contents of this thesis are as follows:

1. To address the sparse features of news headlines and the strong context dependence in news text classification, this thesis studies a classification method based on semantic enhancement. First, the BERT language model replaces the traditional Word2Vec for semantic representation; it is then combined with an improved Bi-LSTM based on the Self-Attention mechanism to extract context features adaptively. On this structure, a semantic-enhanced classification model, Multiple-Fusion, is proposed. In addition, an improved word-level data augmentation strategy is designed to improve the generalization and accuracy of the classification model.

2. Building on fine-grained news text classification, a generative (abstractive) automatic summarization model is proposed for news. To address the lack of fluency and generalization ability in news summary generation, a vector fusion strategy, FME, is first proposed to represent word vectors; a Transformer network model based on the Encoder-Decoder framework is then used to extract global text features adaptively. The two are combined into the generative summarization model FME-Transformer, which is applied to news summarization. Experimental results show that the generated news summaries are close to the level of human understanding.

3. A news text classification and automatic summarization system based on deep learning is designed and implemented. Based on the algorithms proposed above and Xinhua Net news data, the system integrates the core functional modules of news collection, news text classification, and news summary generation. The system is functionally complete and has a friendly interface that meets practical needs. In addition, functional and performance tests verify the usability and efficiency of the system. The research results can serve as a reference in the fields of text classification and automatic summarization.
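The abstract does not spell out how the Self-Attention mechanism is applied to the Bi-LSTM outputs in the first research item. As a rough illustration only, the sketch below shows one common and much simplified form: scaled dot-product attention pooling, where each hidden state is scored against a query vector and the scores are turned into weights for a weighted sum. The names `attention_pool`, `hidden_states`, and `query` are hypothetical and not taken from the thesis; in a real model the hidden states would come from the Bi-LSTM and the query would be learned during training.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: List[List[float]], query: List[float]
) -> Tuple[List[float], List[float]]:
    """Pool a sequence of hidden states into one context vector.

    Assumption: hidden_states stands in for per-token Bi-LSTM outputs and
    query for a learned vector; both are plain Python lists here.
    """
    if not hidden_states:
        raise ValueError("hidden_states must not be empty")
    dim = len(query)
    scale = math.sqrt(dim)
    # Score each token by its scaled dot product with the query.
    scores = [
        sum(h_d * q_d for h_d, q_d in zip(h, query)) / scale
        for h in hidden_states
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


if __name__ == "__main__":
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    context, weights = attention_pool(states, query=[1.0, 1.0])
    print("weights:", weights)
    print("context:", context)
```

In a classifier, the pooled context vector would typically be passed to a fully connected layer with a softmax over the news categories; the thesis's improved variant may differ from this sketch.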
Keywords/Search Tags: deep learning, natural language processing, news collection, news text classification, news summary generation