
Research of Hybrid Text Summarization Technology Based on Deep Learning

Posted on: 2022-11-05  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Ma  Full Text: PDF
GTID: 2518306776952639  Subject: Automation Technology
Abstract/Summary:
With the rapid development of Internet technology, the volume of text data is also growing rapidly. This flood of text makes it difficult for both humans and computers to quickly acquire and process the main information in a document. How to extract the core content from massive text quickly and effectively, through mathematical theory and technical means, and distill it into a high-quality summary has therefore become an urgent problem. This thesis takes text summary generation as its main research object and seeks a technical method for producing high-quality summaries.

In recent years, summarization techniques of all kinds have matured, and the quality of generated summaries is better than ever. Nevertheless, each family of techniques still has obvious shortcomings. The coarse processing granularity of extractive summarization leads to redundant text in the results, and coherence between the extracted text units cannot be well guaranteed. Abstractive summarization has relatively low semantic relevance to the source and usually requires a large amount of training data, so the quality of that data strongly limits the peak performance of the model; in addition, training is generally time-consuming, and heavier models demand computing resources that hinder adoption. Existing hybrid techniques are either too complex or involve substantial manual intervention, which makes them hard to apply to large-scale data or to online scenarios requiring fast response.

To address these problems, this thesis proposes a hybrid text summarization method, with work in two areas, the language model and the summarization technique:

1. A mixed-granularity language model. The model is designed from the perspective of three granularities (word, sentence, and discourse), builds on BERT and a graph attention network (GAT), and focuses on capturing and integrating latent features at the three granularities to assist the downstream summarization task. It is a sentence-level representation model that can supply dense sentence embeddings to a variety of NLP tasks.

2. A hybrid text summarization method. This method combines the characteristics of extractive and abstractive summarization. It first applies extractive techniques to guarantee coverage of the core content, which yields good ROUGE-1 performance (over 45) on two different public datasets. It then makes full use of the extracted results for rewriting and for encoder pre-training, in order to improve the fluency of the summary; this fluency advantage produces first-class ROUGE-L performance (over 40) on the same two datasets.

3. A semantic contrastive learning task. The task is designed on top of the mixed-granularity language model, following the principle of "narrowing semantics and alienating redundancy" (pulling semantically matching pairs together while pushing redundant ones apart), so as to improve the quality of the generated summaries.
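Contribution 1 pairs BERT sentence embeddings with graph attention. As a minimal illustration only, not the thesis's actual architecture, the following NumPy sketch shows one single-head GAT-style aggregation step over sentence-node features; the dimensions, the LeakyReLU slope, and the tanh output activation are all assumptions.

```python
import numpy as np

def graph_attention(h, adj, W, a):
    """One single-head GAT-style aggregation step.

    h:   (n, d) node features, e.g. BERT sentence embeddings
    adj: (n, n) 0/1 adjacency (1 = the nodes may attend to each other)
    W:   (d, d_out) shared linear projection
    a:   (2 * d_out,) attention vector
    """
    z = h @ W                                    # project node features
    n = z.shape[0]
    # pairwise attention logits: e_ij = LeakyReLU(a^T [z_i || z_j])
    e = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e[i, j] = np.concatenate([z[i], z[j]]) @ a
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -1e9)               # mask non-neighbours
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # softmax over neighbours
    return np.tanh(alpha @ z)                    # aggregated node features

# Toy example: 4 sentence nodes, 8-dim embeddings, fully connected graph.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
adj = np.ones((4, 4))
out = graph_attention(h, adj, rng.normal(size=(8, 6)), rng.normal(size=(12,)))
```

In a discourse-level variant of this idea, `adj` would encode which sentences belong to the same discourse segment rather than a fully connected graph.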
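The ROUGE-1 and ROUGE-L figures cited for contribution 2 are the standard unigram-overlap and longest-common-subsequence F1 metrics. A plain token-level sketch of both, not the exact evaluation toolkit used in the thesis:

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Unigram-overlap ROUGE-1 F1 between two token lists."""
    c, r = Counter(candidate), Counter(reference)
    overlap = sum((c & r).values())               # clipped unigram matches
    if overlap == 0:
        return 0.0
    prec, rec = overlap / len(candidate), overlap / len(reference)
    return 2 * prec * rec / (prec + rec)

def rouge_l(candidate, reference):
    """Longest-common-subsequence ROUGE-L F1 between two token lists."""
    m, n = len(candidate), len(reference)
    dp = [[0] * (n + 1) for _ in range(m + 1)]    # LCS dynamic program
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if candidate[i] == reference[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    prec, rec = lcs / m, lcs / n
    return 2 * prec * rec / (prec + rec)

cand = "the cat sat on the mat".split()
ref = "the cat lay on the mat".split()
r1, rl = rouge_1(cand, ref), rouge_l(cand, ref)   # both 5/6 here
```

Scores reported in the literature (e.g. "ROUGE-1 over 45") are these F1 values multiplied by 100.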
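Contribution 3's principle of "narrowing semantics and alienating redundancy" matches the usual contrastive objective: pull a semantically matching (summary, document) embedding pair together while pushing redundant or mismatched pairs apart. A minimal InfoNCE-style sketch, where the temperature `tau` and the choice of cosine similarity are assumptions:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss on L2-normalised embeddings.

    anchor:    (d,)   e.g. generated-summary embedding
    positive:  (d,)   semantically matching document embedding
    negatives: (k, d) redundant / mismatched embeddings to push away
    """
    def norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    a, p, n = norm(anchor), norm(positive), norm(negatives)
    logits = np.concatenate([[a @ p], n @ a]) / tau   # cosine similarities
    logits -= logits.max()                            # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

# Toy example: 2-dim embeddings for one positive and two negatives.
a = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])
negs = np.array([[0.0, 1.0], [-1.0, 0.2]])
loss = info_nce(a, p, negs)
```

Minimising this loss simultaneously raises the anchor-positive similarity (narrowing semantics) and lowers the anchor-negative similarities (alienating redundancy).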
Keywords/Search Tags:Deep learning, Text summarization, Pre-training, Transformer, Graph attention neural network