
Research On Chinese Text Summarization Technology Based On BERT-KA-PGN Model

Posted on: 2022-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: S Z Zhang
Full Text: PDF
GTID: 2518306476490834
Subject: Communication and Information System
Abstract/Summary:
In recent years, the rapid development of the Internet has brought people a wealth of information, but it has also brought the problem of information overload. It has therefore become increasingly important to study how to quickly obtain key information from massive amounts of text, and automatic text summarization technology is a key area of this research. With the development of deep learning, research on its application to text summarization continues to grow. Aiming at the problems of inaccurate semantic representation and insufficient key information in existing text summarization models, this thesis improves automatic text summarization technology for the task of Chinese text summarization. The main research work is as follows.

To address the problems mentioned above, this thesis designs an improved text summarization model based on the Seq2Seq architecture with an attention mechanism, namely the BERT-Keywords Attention-Pointer Generator Network (BERT-KA-PGN). In this model, the BERT pre-trained language model is added to the network as the word embedding layer to enhance the contextual understanding of the input sentences, yielding richer semantic representation vectors. Keywords are extracted by a keyword extraction algorithm and then integrated into the attention mechanism, so that the model pays more attention to the main information of the text while generating the summary and the generated summary contains more key information. At the same time, the model incorporates the structural advantages of the pointer-generator network, whose pointer mechanism and coverage mechanism address out-of-vocabulary words and repetition in the generated summary, thereby improving summary quality.

The experimental results show that, compared with the Seq2Seq model with attention, the ROUGE-1, ROUGE-2 and ROUGE-L scores of the pointer-generator network rise to 34.46%, 19.42% and 30.17% respectively on the NLPCC2017 dataset, and to 37.52%, 21.78% and 32.34% respectively on the PaddlePaddle dataset, which demonstrates the structural advantages of the pointer-generator network. Compared with the pointer-generator network, the corresponding ROUGE scores of the BERT-KA-PGN model rise to 38.65%, 22.43% and 33.51% on the NLPCC2017 dataset, and to 42.28%, 23.89% and 35.63% on the PaddlePaddle dataset. In conclusion, adding BERT and fusing the keyword attention mechanism improve the output quality of the automatic text summarization model, which has a certain reference value for research on automatic text summarization.
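As a minimal illustrative sketch of the decoding step described above (the abstract does not give the exact formulation, so the layer names, tensor shapes, and the specific way the keyword signal and coverage vector enter the attention score are assumptions), the keyword-fused attention and the generate-versus-copy gate of a pointer-generator decoder could be written as follows in PyTorch:

import torch
import torch.nn as nn
import torch.nn.functional as F

class KeywordPointerAttention(nn.Module):
    # Hypothetical module: fuses a keyword indicator and a coverage vector
    # into additive attention, and produces the generate-vs-copy gate p_gen.
    def __init__(self, hidden_size):
        super().__init__()
        self.w_enc = nn.Linear(hidden_size, hidden_size, bias=False)  # encoder (BERT) states
        self.w_dec = nn.Linear(hidden_size, hidden_size, bias=False)  # decoder state
        self.w_cov = nn.Linear(1, hidden_size, bias=False)            # coverage feature
        self.w_key = nn.Linear(1, hidden_size, bias=False)            # keyword feature
        self.v = nn.Linear(hidden_size, 1, bias=False)
        self.gen_gate = nn.Linear(hidden_size * 2, 1)

    def forward(self, enc_states, dec_state, coverage, keyword_mask):
        # enc_states:   (B, T, H) source token representations from BERT
        # dec_state:    (B, H)    current decoder hidden state
        # coverage:     (B, T)    running sum of past attention weights
        # keyword_mask: (B, T)    1.0 where the source token was extracted as a keyword
        features = (self.w_enc(enc_states)
                    + self.w_dec(dec_state).unsqueeze(1)
                    + self.w_cov(coverage.unsqueeze(-1))
                    + self.w_key(keyword_mask.unsqueeze(-1)))
        scores = self.v(torch.tanh(features)).squeeze(-1)              # (B, T)
        attn = F.softmax(scores, dim=-1)                               # keyword-aware attention
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)  # (B, H) context vector
        coverage = coverage + attn                                     # update coverage
        p_gen = torch.sigmoid(self.gen_gate(torch.cat([context, dec_state], dim=-1)))
        return attn, context, coverage, p_gen

At each decoding step the final word distribution would then be the usual pointer-generator mixture, P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (sum of attn over the source positions holding w), so out-of-vocabulary source words can be copied directly, while the coverage vector discourages attending to the same positions repeatedly.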
Keywords/Search Tags: Text summarization, Seq2Seq, BERT, Keywords, Attention mechanism