Automatic Summarization System For Multi-domain Text

Posted on: 2022-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Lv
Full Text: PDF
GTID: 2518306536963689
Subject: Computer Science and Technology
Abstract/Summary:
Automatic text summarization is an important task in natural language processing. With the explosive growth of information on the Internet, there is widespread demand for automatic summarization that condenses the critical information in long texts. With the development of artificial intelligence technology and improvements in hardware computing power, automatic text summarization systems have evolved from rule-based methods to feature-engineering-based methods and, in recent years, to methods based on deep learning. The representation learning ability of deep learning methods and their support for large-scale models have brought automatic text summarization to a new stage. However, because text corpora are difficult to obtain, most existing work focuses on news text. Although there is strong demand for summarization of scientific papers, social media, dialogues, meetings, and other domains, high-performance summarization systems trained on news text are difficult to apply effectively in those domains.

Therefore, this thesis focuses on the adaptation and transfer of automatic summarization systems across multi-domain texts. Using deep learning methods, it explores how to obtain an automatic text summarization system with multi-domain adaptation and transfer capabilities from only a few samples. Specifically, the thesis focuses on the abstractive summarization task and explores the domain adaptation and domain transfer problems of multi-domain texts through three solutions based on pre-trained language models.

First, because existing automatic text summarization systems lack domain adaptation capability, this thesis proposes a domain-adaptive text summarization model based on multi-task learning. The model uses the domain generalization ability of the pre-trained language model together with joint learning on unlabeled data to expand its coverage of different domains.

Second, for domains that lack unlabeled data, this thesis proposes a domain feature mining method that uses the domain spatial distribution distance as the domain feature and merges this feature with the pre-trained language model to obtain better domain transfer ability.

Third, this thesis combines the two preceding studies and proposes a model-agnostic automatic text summarization method based on meta-learning. A model generator is obtained by training on unlabeled domain text, and a summarization model for a specific target domain is then generated by fine-tuning with a small set of labeled examples.

The experimental results show that the three models and methods proposed in this thesis improve abstractive summarization performance on multiple domains, and that the trained models have better domain adaptation and transfer capabilities. All three methods aim to improve the adaptability and transferability of automatic text summarization systems on multi-domain texts. By addressing different situations and combining the text summarization system with the pre-trained language model, this thesis proposes effective solutions for multi-domain text summarization. It also provides new ideas for future research on automatic text summarization systems for multi-domain texts.
Keywords/Search Tags: Automatic Summarization, Multi-Task Learning, Meta Learning, Domain Adaptation, Domain Transfer