Research On Text Summarization Method Based On Deep Learning

Posted on: 2023-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Tang
Full Text: PDF
GTID: 2568306836469344
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, automatic text summarization has become an important research direction in natural language processing. The task aims to condense a long document into a short summary that contains only the key information. Although deep learning has given models strong representation ability, several problems remain: extractive summarization lacks explicit modeling of sentence selection and ordering, while abstractive summarization lacks modeling of key information and has difficulty generating out-of-vocabulary words. To address these problems, this thesis studies and improves existing text summarization techniques. The main research contents are as follows:

(1) This thesis proposes an extractive text summarization model based on semantic matching. The method combines multiple key sentences from the document into candidates and uses these candidates as the extraction units. A semantic matching network built with deep learning computes the similarity between each candidate and the document, so that the model extracts the candidate that best matches the document semantically. The experimental results show that candidates are a reasonable choice of extraction unit and verify that the method extracts summaries whose semantics and topic are similar to those of the document.

(2) This thesis proposes an abstractive text summarization model based on key information masking and copying. The method uses an information extraction algorithm to extract the key information of the document. The key information is modeled by improving the existing masked language model and copy mechanism, and an abstractive model is built on BERT+Seq2seq. The experimental results show that the method enables BERT to generate summaries and to copy key information and continuous sequences from the document.

(3) This thesis proposes a paragraph-level text summarization model based on maximal marginal relevance. To improve the abstractive model's ability to process long documents, the method combines the extractive model with the abstractive model to build a paragraph-level summarization model. Paragraph-level maximal marginal relevance is used to eliminate redundant information between the segment summaries, which simplifies the input to the abstractive model. The experimental results verify that the method can effectively handle the long-text summarization task.
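
To illustrate the idea behind contribution (3), the following is a minimal sketch of greedy maximal marginal relevance (MMR) selection over paragraphs. It is not the thesis's implementation: the abstract does not give the exact paragraph-level formulation, so this uses the classic MMR score (lam * relevance - (1 - lam) * redundancy) with bag-of-words cosine similarity standing in for whatever learned representations the thesis uses. The names `cosine`, `mmr_select`, `k`, and `lam` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(paragraphs, document, k=2, lam=0.5):
    """Greedily pick up to k paragraph indices by classic MMR.

    Assumption: relevance is similarity to the whole document and
    redundancy is the highest similarity to an already selected paragraph.
    """
    vecs = [Counter(p.lower().split()) for p in paragraphs]
    doc_vec = Counter(document.lower().split())
    selected = []
    remaining = list(range(len(paragraphs)))

    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], doc_vec)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


paragraphs = [
    "the cat sat on the mat",
    "the cat sat on the mat",
    "dogs bark loudly at night",
]
document = " ".join(paragraphs)
picked = mmr_select(paragraphs, document, k=2, lam=0.5)
```

With `lam=0.5` the duplicate second paragraph is penalized for redundancy, so the selection skips it in favor of the third paragraph; with `lam=1.0` redundancy is ignored and the duplicate is chosen. In a pipeline like the one the abstract describes, the selected paragraphs would then form the shortened input to the abstractive model.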
Keywords/Search Tags: Text summarization, deep learning, BERT, natural language processing