The Technology Of Automatic Text Summarization Based On Deep Learning

Posted on: 2020-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: W N Li
Full Text: PDF
GTID: 2428330602951053
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet and the mobile Internet, the volume of online information has grown explosively. How to acquire the required information effectively and automatically has become a research hotspot in information science. Automatic summarization, which compresses and refines text, is one of the important means of addressing this problem. Existing extractive summarization techniques build a summary by picking keywords or reorganizing sentences from the original text. This approach is easy to implement, but the resulting sentences are often not logical or fluent enough, and because every summary word comes from the original text, the language is not rich enough. Generative summarization uses intelligent algorithms to understand the text content and then generates summaries that are more logical and fluent. At present, general models still have problems that deserve study, such as inaccurate summaries and insufficient semantics. Based on an in-depth analysis of existing word vectorization methods and deep learning models, this paper proposes a generative automatic summarization model based on the BTWPS automatic encoder. The automatic encoder consists of two parts, an encoder and a decoder. The work in this paper covers three aspects: word vectorization, the encoder, and the decoder.

(1) Word vectorization. We introduce two semantic features, keyword and part of speech, and propose a vectorization method that incorporates the TF-IDF value and part-of-speech tagging information into the basic word vector to form a new word vector. Compared with the original word vectors, this method highlights the keyword and part-of-speech features of words, improves the representation of word meaning, and ultimately improves the quality of the summary.

(2) Encoder. We analyze the advantages and disadvantages of three kinds of gate structures in recurrent neural networks. To address the inadequate memory ability of general recurrent neural networks and the inaccurate intermediate semantics of general automatic encoders, we introduce the gated recurrent unit gate structure, bidirectional recurrent neural networks, and the attention mechanism, and construct a generative automatic summarization encoder. Compared with simple encoders, this structure generates more accurate intermediate semantics and improves text comprehension ability.

(3) Decoder. We analyze in depth the structural characteristics of the multi-layer recurrent neural network decoder, and propose a decoder based on state layers together with a vocabulary reorganization scheme for the decoder that uses proximity words. Experiments determine the degree of proximity and explore the influence of state layers and mapping layers on the summary results. The scheme improves the accuracy of summary sentences and enriches the semantic expression ability of the decoder.

The final scheme performs well under the ROUGE evaluation system. The proposed automatic encoder model can also be applied in other fields.
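The word vectorization step in (1) can be illustrated with a minimal sketch. The abstract does not say how the TF-IDF value and the part-of-speech tag are merged into the basic word vector, so the code below assumes simple concatenation; the tag set, the toy corpus, the base vector, and the helper names `tf_idf`, `pos_one_hot`, and `augment` are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

# Assumed, simplified POS tag set (not the one used in the thesis).
POS_TAGS = ["NOUN", "VERB", "ADJ", "OTHER"]


def tf_idf(term, doc, corpus):
    """Plain TF-IDF: term frequency in `doc` times log(N / document frequency)."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in corpus if term in d)
    return tf * math.log(len(corpus) / df) if df else 0.0


def pos_one_hot(tag):
    """One-hot encoding of a POS tag over the assumed tag set."""
    return [1.0 if tag == t else 0.0 for t in POS_TAGS]


def augment(base_vector, term, tag, doc, corpus):
    """Build [base vector | TF-IDF | POS one-hot]; concatenation is an assumption."""
    return list(base_vector) + [tf_idf(term, doc, corpus)] + pos_one_hot(tag)


# Toy tokenized corpus, for illustration only.
corpus = [
    ["model", "generates", "summary"],
    ["model", "reads", "text"],
]
base = [0.1, -0.2, 0.3]  # stand-in for a pretrained embedding of "summary"
vec = augment(base, "summary", "NOUN", corpus[0], corpus)
print(len(vec))  # 3 base dimensions + 1 TF-IDF value + 4 POS dimensions
```

In this sketch a word that appears in every document, such as "model", receives a TF-IDF value of zero, so only rarer words carry a keyword signal in the extra dimension. Other combination rules, such as scaling the base vector by the TF-IDF weight, would be equally consistent with the description above.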
Keywords/Search Tags: Generative Automatic Summary, Word Vector, Automatic Encoder, Recurrent Neural Network