
Research On Deep Neural Networks Based Automatic Text Summarization

Posted on: 2021-01-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Y Zhou
Full Text: PDF
GTID: 1368330614950954
Subject: Computer application technology
Abstract/Summary:
Automatic Text Summarization has long been an important task and research direction for both Artificial Intelligence and Natural Language Processing. With the explosive growth of information on the Internet, users expect more and more applications of Automatic Text Summarization, and many new scenarios and tasks, such as search engines, smart speakers, and virtual assistants, demand higher performance from summarization systems. Continuing research has pushed Automatic Text Summarization to a new level: the methods for building summarization systems have evolved from heuristic rules, to feature-engineering-based statistical machine learning, and most recently to deep neural networks. As a new machine learning paradigm, deep neural networks have great potential for Automatic Text Summarization thanks to their powerful representation learning and their ability to automatically learn the mapping between input and output. However, many research problems remain in this area, so this thesis focuses on several of them and explores how to improve Automatic Text Summarization with deep neural networks. Taking sentence summarization and document summarization as the main line, we study four important aspects of two core issues: importance modeling and the summary construction process.

First, we propose a selective mechanism to address the problem that current abstractive sentence summarization models lack explicit modeling of the importance of the input information. The purpose of summarization is to identify important information and produce an output summary; however, the attention mechanism in the sequence-to-sequence architecture does not model input importance explicitly. We propose a selective encoding mechanism that models information selection as an explicit process. Specifically, the proposed selective encoding gate judges the importance of each word individually according to the semantic meaning of the entire input sentence. In this way, the information-selection process is modeled explicitly, so the decoder can produce the output summary more easily and accurately. Extensive experiments show that the proposed model both improves the performance of the sentence summarization system and helps to identify important information.
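To make the selective mechanism concrete, the following is a minimal PyTorch-style sketch of such a gate: each per-word encoder state is rescaled by a sigmoid gate conditioned on a representation of the whole sentence, and the gated states would then serve as the memory the decoder attends over. The class name, layer shapes, and projection layout here are illustrative assumptions, not the implementation used in the thesis.

    # Hypothetical sketch of a selective encoding gate (not the thesis code):
    # per-word encoder states are filtered by a gate computed from the
    # whole-sentence representation before they are handed to the decoder.
    import torch
    import torch.nn as nn

    class SelectiveGate(nn.Module):
        def __init__(self, hidden_size: int):
            super().__init__()
            self.word_proj = nn.Linear(hidden_size, hidden_size, bias=False)
            self.sent_proj = nn.Linear(hidden_size, hidden_size, bias=True)

        def forward(self, enc_states: torch.Tensor, sent_repr: torch.Tensor):
            # enc_states: (batch, seq_len, hidden) -- one state per input word
            # sent_repr:  (batch, hidden)          -- meaning of the whole sentence
            gate = torch.sigmoid(self.word_proj(enc_states)
                                 + self.sent_proj(sent_repr).unsqueeze(1))
            return enc_states * gate  # unimportant words are down-weighted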
Second, we propose a sequential copying network to tackle the problem that current copying methods lack the ability to extract important information spans. Existing copying methods can copy a single word from the input to the output; under this single-word-copying paradigm, however, they cannot copy an entire span. To address this issue, we propose a sequential copying network for sentence summarization. The decoding process of the proposed method has two modes, a sequential copying mode and a generation mode. In copying mode, the model copies a full span with a pointer network driven by a dedicated copying hidden state. After a sequential copy, a Copy Run mechanism updates the decoder state so that the decoder can switch smoothly between the copying and generation modes. Experimental results show that the proposed model can copy important spans and generate fluent output.

Third, we study the problem of unimportant and redundant information in extractive document summarization, investigate its cause and severity, and propose a hierarchical, sub-sentential-unit-aware extractive summarization model. Existing extractive summarization work uses the full sentence as the basic extraction unit; however, this can introduce unimportant and redundant information, and the issue had not previously been studied explicitly and quantitatively. In this work, we conduct statistical analysis and human labeling to study the problem, and the results show that unimportant and redundant information appears even in oracle systems. To this end, we propose a new paradigm that extracts sub-sentential units for extractive summarization systems. This method separates important from unimportant information so that both issues can be alleviated. Experiments show that the proposed method improves system performance, and that the unimportant and redundant information issues are alleviated compared with a system that extracts full sentences.

Finally, we propose a joint sentence scoring and selection model for extractive document summarization to address the problem that these two steps are not well combined. Previous work on extractive document summarization first measures the importance of sentences (sentence scoring) and then selects suitable candidates with certain strategies (sentence selection). In existing methods these two separate sub-tasks cannot benefit from each other; for example, the sentence selection process has no influence on how sentence importance is measured. We propose an end-to-end neural network model that jointly scores and selects sentences. Specifically, the sentences of the input document are mapped to a list of real-valued vectors, and the model then measures their importance using not only their own content but also the content of the previously selected sentences. Moreover, we propose a new loss function that captures the subtle importance differences between sentences. Experiments show that the joint sentence scoring and selection model outperforms the separated baseline, and that the proposed loss function further improves the extractive summarization system.
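As a rough illustration of the joint scoring-and-selection idea, the sketch below scores candidate sentences against a recurrent state that summarizes the previously selected sentences, so that selection influences later importance estimates. All names and architectural details (the class `JointScorerSelector`, the GRU-based selection state, greedy selection) are assumptions for illustration; the thesis's actual model and its loss function are not reproduced here.

    # Hypothetical sketch of joint sentence scoring and selection:
    # each score depends on the candidate sentence and on a state that
    # accumulates the content of previously selected sentences.
    import torch
    import torch.nn as nn

    class JointScorerSelector(nn.Module):
        def __init__(self, dim: int):
            super().__init__()
            self.state_cell = nn.GRUCell(dim, dim)   # remembers what was selected
            self.scorer = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh(),
                                        nn.Linear(dim, 1))

        def forward(self, sent_vecs: torch.Tensor, num_steps: int):
            # sent_vecs: (num_sents, dim) -- document sentences as real-valued vectors
            state = sent_vecs.new_zeros(1, sent_vecs.size(1))
            mask = torch.zeros(sent_vecs.size(0), dtype=torch.bool)
            selected = []
            for _ in range(num_steps):
                ctx = state.expand(sent_vecs.size(0), -1)
                scores = self.scorer(torch.cat([sent_vecs, ctx], dim=-1)).squeeze(-1)
                scores = scores.masked_fill(mask, float("-inf"))
                idx = int(scores.argmax())           # greedy choice at inference time
                selected.append(idx)
                mask[idx] = True
                # feed the chosen sentence back, so future scores depend on it
                state = self.state_cell(sent_vecs[idx].unsqueeze(0), state)
            return selected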
In summary, this thesis proposes four methods that tackle problems in automatic text summarization: modeling important words during encoding, extracting important information spans during decoding, avoiding unimportant and redundant information, and combining sentence scoring and selection. For each problem, we design dedicated neural network models and achieve improvements. Our work also provides new methodologies and perspectives for future research on automatic text summarization.

Keywords/Search Tags: Automatic Text Summarization, Sentence Summarization, Document Summarization, Neural Network, End-to-End Model