Research On Automatic Summary Generation For News Based On Deep Learning

Posted on: 2019-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Han
Full Text: PDF
GTID: 2518306473953999
Subject: Computer Science and Technology
Abstract/Summary:
News websites face explosive data growth while providing multiple open services to the public. It is increasingly important for them to extract and display their important content comprehensively and accurately from vast amounts of data. In obtaining valid information, the extraction of text content and the generation of summaries are two key technologies. Of these, text representation provides the basis for text summarization. Text representations are generally count-based. This approach ignores the semantic information of texts and requires considerable manual effort in feature selection, which leads to highly sparse, high-dimensional features and reduces the validity of the representations. In addition, many similar articles describe the same events on the same topic, making it hard for readers to quickly obtain key information from massive and redundant content. This also poses new challenges for the multi-document summarization task. Especially when labeled summary data is in seriously short supply, traditional summarization methods need to spend a lot of time on unsupervised learning to construct graph models, which is inefficient. In recent years, the design of semantic-based text representation and document summarization algorithms has become a new research focus.

This paper studies text representation and multi-document summarization based on deep learning and related technologies. It proposes an improved text representation, an accurate and efficient multi-document summarization model, and a text style transfer model, and ultimately designs and implements an automatic summary generation system for news based on deep learning. The main work and innovations of this paper fall into three aspects, as follows:

1) In terms of feature extraction, automatic feature learning is applied to text representation. Word vectors are trained using the Word2Vec model, avoiding the curse of dimensionality, reducing the amount of feature engineering, and alleviating the problem of out-of-vocabulary words. This lays a foundation for text representation in multi-document summarization.

2) In terms of document summarization, this paper focuses on the task of automatic multi-document summarization, whose function is to generate a concise summary from a set of topic-related documents. Unlike traditional graph-based algorithms, this paper incorporates the idea of deep learning. By constructing a hierarchical auto-encoder language model, high-level abstract features can be extracted from shallow features, and a new deep-learning-based abstractive multi-document summary generation model (HierAutoEncoder) is proposed. This model can generate a general multi-document summary. To improve the quality of the generated summary, this paper proposes two further models, VS-HierAutoEncoder and WC-HierAutoEncoder, which combine abstractive and extractive summarization methods on the basis of the HierAutoEncoder model. The effectiveness and efficiency of the algorithms have been verified by experiments. HierAutoEncoder achieves good results among abstractive multi-document summarization methods, and WC-HierAutoEncoder likewise achieves good results among extractive multi-document summarization methods.

3) In terms of textual adaptability, a deep learning model is adopted to enhance the system's versatility. To meet the needs of automatic summarization, this paper proposes a text style transfer algorithm based on deep learning. The algorithm can transfer a general news summary into a stylized summary adapted to different platforms (e.g., Weibo), which not only ensures the timeliness of news release on the platform but also ensures the style and adaptability of the news text.
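As a rough illustration of the count-based representation that the abstract criticizes, the following Python sketch builds bag-of-words vectors and measures the fraction of zero entries. It is not taken from the thesis: the function names `bag_of_words` and `sparsity` and the three toy sentences are invented here for illustration, and tokenization is assumed to be simple whitespace splitting.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    # One dimension per distinct token: the vocabulary grows with the corpus.
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vec = [0] * len(vocab)
        for tok, n in counts.items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors


def sparsity(vectors):
    """Fraction of zero entries across all vectors."""
    total = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for x in v if x == 0)
    return zeros / total if total else 0.0


# Hypothetical toy corpus, not data from the thesis.
docs = [
    "the market rose sharply today",
    "stocks climbed as the market rallied",
    "heavy rain flooded the city streets",
]
vocab, vectors = bag_of_words(docs)
print(len(vocab), round(sparsity(vectors), 2))
```

Even in this tiny corpus most entries are zero, and the near-synonyms "rose" and "climbed" occupy unrelated dimensions, which is the kind of lost semantic information that motivates learned word vectors.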
Keywords/Search Tags: multi-document summarization, deep learning, recurrent neural network, auto-encoder network, text style transfer