
Research On Meeting Text-oriented Extractive Summarization

Posted on: 2021-03-13 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Sun | Full Text: PDF
GTID: 2428330611998191 | Subject: Computer technology
Abstract/Summary:
With the development of Internet technology, the exponential growth of information makes it difficult for people to access information resources efficiently, and automatic summarization technology has therefore attracted extensive attention. Unlike the common single-document summarization problem, meeting text takes two different forms, monologue and dialogue, and is characterized by information redundancy and incoherent content. With the current epidemic in particular, demand for online meetings has surged, and automatic meeting summarization has become increasingly important. Extractive summaries are also more stable and readable than abstractive ones. This thesis focuses on extractive summarization of meeting text. It proposes two extractive summarization models with different structures, a fusion of the two models, and a novel method of modeling speaker role features for dialogue meeting text. The work covers three aspects: sequence extraction based on multi-task learning, end-to-end probabilistic prediction based on reinforcement learning, and speaker feature modeling based on a conditional variational autoencoder.

The sequence extraction technique based on multi-task learning uses a sequence extraction model. To take better advantage of the paragraph information in the input text, multi-task learning built on the sequence extraction model is trained jointly with the target task, so that relevant information is shared and the target task improves. Experimental results show that the multi-task sequence extraction model achieves satisfactory results on the iflytek data set used in this work, which demonstrates the effectiveness of the method.

The end-to-end probabilistic prediction technique based on reinforcement learning uses a different model structure from the multi-task sequence extraction technique: an encoder-decoder structure performs the extractive summarization. To address the inconsistency between the training objective and the evaluation metric, the end-to-end model is trained with reinforcement learning and takes the F1 value as the reward function. Experimental results show that this technique performs well on the data set used in this work and plays a certain role as auxiliary training. Finally, the two extractive summarization models are fused, and the fusion model achieves the best results on the iflytek data set.

The speaker feature modeling technique based on a conditional variational autoencoder is proposed for dialogue meeting text. Given the importance of the speaker, this work uses a speaker feature modeling module based on a conditional variational autoencoder to add speaker feature information to sentences; the module can be integrated into a summarization model simply and effectively. Because the module models speaker characteristics rather than completing the summarization task on its own, its effectiveness is verified by adding it to both an extractive summarization model and an abstractive summarization model. The results show that the method yields a significant improvement over the baseline.
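
The use of the F1 value as a reinforcement learning reward, mentioned in the abstract, can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes that F1 is computed over the indices of the extracted sentences against a reference set, that each sentence is kept or skipped independently, and that a plain REINFORCE surrogate loss is used. The names `f1_reward`, `sample_extraction`, and `reinforce_loss` are hypothetical.

```python
import math
import random


def f1_reward(selected, reference):
    """F1 between selected sentence indices and reference indices (assumed reward)."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(selected)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def sample_extraction(probs, rng):
    """Sample a keep/skip decision per sentence.

    Returns the kept indices and the log-probability of the whole sample.
    """
    picks, log_prob = [], 0.0
    for i, p in enumerate(probs):
        keep = rng.random() < p
        if keep:
            picks.append(i)
        log_prob += math.log(p if keep else 1.0 - p)
    return picks, log_prob


def reinforce_loss(log_prob, reward, baseline=0.0):
    """REINFORCE surrogate: minimizing it favors samples whose reward beats the baseline."""
    return -(reward - baseline) * log_prob


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.9, 0.2, 0.7, 0.1]   # hypothetical per-sentence keep probabilities
    reference = [0, 2]             # hypothetical reference sentence indices
    picks, log_prob = sample_extraction(probs, rng)
    reward = f1_reward(picks, reference)
    print(picks, reward, reinforce_loss(log_prob, reward))
```

Because the reward is the evaluation metric itself, the gradient of this loss pushes the model toward extractions that score well at test time, which is the training/evaluation inconsistency the abstract describes. In a real model the probabilities would come from the encoder-decoder network and the loss would be differentiated by an autograd framework.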
Keywords/Search Tags: extractive summarization, meeting text, multi-task learning, model integration, speaker feature modeling