
The Optimization Of Extractive Text Summarization Based On Pretrained Language Model

Posted on: 2022-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: H F Guo
Full Text: PDF
GTID: 2518306572997559
Subject: Computer technology
Abstract/Summary:
Pretrained language models are widely used across natural language processing tasks. In this thesis, RoBERTa is transferred to the extractive text summarization task, and four optimization methods are applied to improve the quality of the extracted summaries. A hierarchical encoder mechanism is proposed to address the problem of text truncation: it consists of a sentence-level encoder, RoBERTa, and a document-level encoder, a Transformer encoder, which retains more of the source text while providing higher-level information integration. To make better use of the textual relationships within a document, discourse graphs are built from coreference links, and a graph convolutional network is used to update the graph node representations. Two two-stage approaches are also applied. The extract-then-match approach dynamically determines the number of extracted sentences and optimizes the sentence combination by matching candidate summaries: the matching model is a siamese network based on RoBERTa, which maps the original text, the reference summary, and the candidate summaries into the same semantic space, and the candidate summary closest to the original text is selected. The extract-then-rewrite approach is used to reduce the redundancy of the extractive summary: the abstractor is a Transformer whose encoder is replaced by RoBERTa, and the rewriting operation turns the extractive summary into an abstractive one through the abstractor. Finally, comparative experiments are conducted on the CNN/Daily Mail dataset. The experimental results and analysis show that the four improvements to the extractive summarization method address the corresponding problems and further improve the ROUGE scores, and the summaries generated by the models match the original text better.
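To make the hierarchical encoder mechanism concrete, the sketch below shows one plausible way to pair a sentence-level RoBERTa encoder with a document-level Transformer encoder and a per-sentence scoring head for extraction. It is a minimal illustration under stated assumptions, not the thesis's exact model: the Hugging Face "roberta-base" checkpoint, layer counts, and the top-k extraction step are illustrative choices.

import torch
import torch.nn as nn
from transformers import RobertaModel, RobertaTokenizer

class HierarchicalExtractor(nn.Module):
    """Sentence-level RoBERTa encoder + document-level Transformer encoder (illustrative)."""
    def __init__(self, doc_layers: int = 2, doc_heads: int = 8):
        super().__init__()
        self.sent_encoder = RobertaModel.from_pretrained("roberta-base")
        hidden = self.sent_encoder.config.hidden_size  # 768 for roberta-base
        doc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=doc_heads, batch_first=True)
        self.doc_encoder = nn.TransformerEncoder(doc_layer, num_layers=doc_layers)
        self.scorer = nn.Linear(hidden, 1)  # per-sentence extraction score

    def forward(self, input_ids, attention_mask):
        # input_ids / attention_mask: (num_sentences, seq_len) for one document,
        # so each sentence fits within RoBERTa's length limit and the document is not truncated.
        out = self.sent_encoder(input_ids=input_ids, attention_mask=attention_mask)
        sent_vecs = out.last_hidden_state[:, 0, :]           # <s> token as sentence embedding
        doc_ctx = self.doc_encoder(sent_vecs.unsqueeze(0))   # (1, num_sentences, hidden)
        return self.scorer(doc_ctx).squeeze(-1)              # (1, num_sentences) logits

# Usage: tokenize sentences separately, score them, and extract the top-k.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
sentences = ["The first sentence of the article.", "A second, less important sentence."]
batch = tokenizer(sentences, padding=True, truncation=True, max_length=128, return_tensors="pt")
model = HierarchicalExtractor()
with torch.no_grad():
    scores = model(batch["input_ids"], batch["attention_mask"])
extracted = torch.topk(scores.squeeze(0), k=1).indices  # indices of extracted sentences

In this sketch the document-level encoder sees one vector per sentence, so long documents are handled at the sentence-count level rather than the token level, which is the motivation behind the hierarchical design described above.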
Keywords/Search Tags:Pretrained language model, Text summarization, Hierarchical encoder mechanism, Graph neural network, Two-stage approach