
Automatic Summarization Of Academic Literature Based On Deep Learning

Posted on: 2019-04-30    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Y Z Wang    Full Text: PDF
GTID: 1368330572968601    Subject: Management Science and Engineering
Abstract/Summary:
With the arrival of the big data era, online academic resources are growing explosively, and an increasing number of scholars find themselves adrift in a vast ocean of literature. How to automatically summarize a collection of literature from a particular discipline into a concise yet comprehensive report has therefore become a hot issue in the study and practice of knowledge management. As an important natural language processing technology, automatic summarization presents the most critical information in a concentrated form close to users' needs, helping researchers "stand on the shoulders of giants".

This dissertation focuses on improving the automatic summarization of academic literature and develops a research system for "Automatic Summarization of Academic Literature Based on Deep Learning". The system involves deep learning theories and methods such as neural-network-based text representation and Seq2Seq-based summarization, as well as classical text mining algorithms: two statistical topic models (LDA and Labeled-LDA) and two link analysis methods (PageRank and PageRank with Priors). For the numerical experiments, this study selects a considerable portion of the computer science literature from the ACM (Association for Computing Machinery) Digital Library to validate the proposed models. The main contents of this dissertation are as follows:

1. This dissertation formulates literature review generation as a sequential text generation problem and proposes a Seq2Seq model based on hierarchical neural networks, consisting mainly of a hierarchical document encoder and an attention-based decoder. Specifically, the encoder derives sentence-level and document-level semantic representations through a CNN and an RNN respectively, which not only reflects the hierarchical structure of an article but also mitigates the vanishing gradients and information loss caused by long word sequences. During the decoding phase, the saliency and novelty of each candidate sentence are considered simultaneously, minimizing the redundancy of the generated summary while maximizing its representativeness.

2. Because a literature review is context-aware, this dissertation puts forward a Seq2Seq model that fuses contextual information. To characterize the contextual relevance between each candidate sentence and its target document more accurately, Labeled-LDA is first utilized to infer the topic distribution of each sentence; the sentence topics are then integrated into the document encoding process; finally, the source texts are also encoded and included in the decoding phase.

3. Since a static analysis of contextual relevance cannot capture the fact that the text corpus changes dynamically, this dissertation investigates the importance of graph context for literature review generation from an information network perspective and proposes a Seq2Seq model with a joint context-driven attention mechanism. Specifically, Node2vec is first employed to vectorize every node of the heterogeneous bibliography network; the connectivity distance within the graph context is then measured for every pair of papers; finally, two context relevance measures, derived from the texts and from the heterogeneous bibliography network respectively, are introduced into the decoding phase simultaneously.
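The hierarchical encoder in point 1 can be illustrated with a heavily simplified, dependency-free sketch: a 1-D convolution with max-over-time pooling stands in for the CNN sentence encoder, and a plain tanh RNN over the resulting sentence vectors stands in for the document-level RNN. The function names, weight layout, and toy dimensions below are illustrative assumptions, not the dissertation's actual implementation.

```python
import math

def conv1d_maxpool(word_vecs, filters, width=2):
    """CNN-style sentence encoder stand-in: slide each filter over windows
    of `width` consecutive word vectors, apply tanh, then take the max
    activation over time. Returns one feature per filter."""
    feats = []
    for f in filters:  # each filter: flat weights of length width * word_dim
        acts = []
        for t in range(len(word_vecs) - width + 1):
            window = [x for vec in word_vecs[t:t + width] for x in vec]
            acts.append(math.tanh(sum(w * x for w, x in zip(f, window))))
        feats.append(max(acts))
    return feats

def rnn_encode(sent_vecs, W_h, W_x):
    """RNN-style document encoder stand-in: a plain tanh recurrence over
    the sentence vectors; the final hidden state serves as the
    document-level representation."""
    h = [0.0] * len(W_h)
    for x in sent_vecs:
        h = [math.tanh(sum(W_h[i][j] * h[j] for j in range(len(h))) +
                       sum(W_x[i][j] * x[j] for j in range(len(x))))
             for i in range(len(h))]
    return h
```

In a real model the filters and recurrence weights would be learned jointly with the decoder; here they are just fixed matrices so the data flow (words → sentence vectors → document vector) is visible.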
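The saliency-versus-novelty trade-off in the decoding phase of point 1 resembles maximal marginal relevance (MMR) selection. The following sketch assumes sentences and the document are represented as vectors and that relevance is measured by cosine similarity; the greedy scoring rule and the `lam` trade-off parameter are illustrative assumptions, not the dissertation's exact decoder.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_sentences(sent_vecs, doc_vec, k, lam=0.7):
    """Greedy MMR-style selection: trade off saliency (similarity to the
    document vector) against redundancy (similarity to sentences already
    chosen), so the summary stays representative yet non-redundant."""
    selected, candidates = [], list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            saliency = cosine(sent_vecs[i], doc_vec)
            redundancy = max((cosine(sent_vecs[i], sent_vecs[j])
                              for j in selected), default=0.0)
            return lam * saliency - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate salient sentences, only one of the pair is kept and the next slot goes to a sentence that adds new information, which is exactly the redundancy-minimizing behavior described above.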
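The joint context-driven attention in point 3 combines two relevance signals per candidate: one measured from the texts and one from the heterogeneous bibliography network (e.g. similarity between Node2vec embeddings of the citing and cited papers). A minimal sketch, assuming the two scores are blended linearly with a mixing weight `beta` and normalized by softmax (the blending rule is an assumption for illustration, not the dissertation's published formulation):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def joint_context_attention(text_rel, graph_rel, beta=0.5):
    """Blend a text-based relevance score and a graph-based relevance
    score for each candidate, then normalize into attention weights."""
    mixed = [beta * t + (1 - beta) * g for t, g in zip(text_rel, graph_rel)]
    return softmax(mixed)
```

A candidate that is relevant both textually and in the citation graph thus receives more attention mass than one supported by only a single signal.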
Keywords/Search Tags:Deep Learning, Automatic Summarization, Seq2Seq Model, Context Relevance, Heterogeneous Bibliography Network