
Research On Automatic Summarization Based On DPPs Sampling And Autoencoders

Posted on: 2021-05-01  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Huang  Full Text: PDF
GTID: 2518306308969269  Subject: Intelligent Science and Technology
Abstract/Summary:
The information explosion that accompanied the mobile Internet has changed how people read in work and daily life, and it calls for reading methods different from those of past decades. For busy city dwellers, automatic summarization offers a convenient way to condense retrieved information, and it provides technical support for the transition from the information age to the intelligence era. The core of automatic summarization is optimizing the document representation and the sampling algorithm. However, previous methods mostly depend on hand-designed statistical features to measure the contribution of each sentence. The golden summaries used for evaluation also carry subjective judgement. Moreover, statistical features remain difficult to apply in multilingual environments because prior knowledge is lacking. Inspired by determinantal point processes (DPPs), we construct a sampling method that simulates the process of extractive summarization and produces summaries with high quality and low redundancy, driven by the texts themselves. On the one hand, statistical features are adopted to model the quality of sentences and the similarity between them; on the other hand, we use a neural network language model and autoencoders to represent the document directly. Accordingly, this thesis proposes two models based on DPPs sampling. For the statistical model, we conduct experiments on the BIRNDL dataset: the results show that the model can find the best linear combination of quality features for the respective golden summaries, and a comparison of Word Mover's Distance (WMD) and Jaccard similarity identifies an appropriate way to measure diversity. In addition, we apply the neural network language model to the MMS-2015 and MSS-2015 datasets. We combine the unsupervised DPPs method with autoencoders to overcome the semantic weakness of shallow statistical features in multilingual environments. ROUGE evaluation of the experimental results demonstrates the validity and generality of our algorithm across a variety of languages. Finally, we design an automatic summarization system in which users can look up the experimental results and obtain summaries of input texts.
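
The idea of selecting sentences with high quality and low redundancy can be illustrated with a minimal sketch. It assumes the common quality-diversity decomposition of a DPP kernel, L[i, j] = q_i * S_ij * q_j, with Jaccard similarity over token sets for S, and it uses greedy maximization of log det(L_Y) as a simple stand-in for sampling. The function names, the whitespace tokenizer, and the quality scores are hypothetical and are not taken from the thesis.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard similarity between two token collections (treated as sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def build_kernel(sentences, quality):
    """Assumed decomposition: L[i, j] = q_i * S_ij * q_j, S = Jaccard matrix."""
    tokens = [s.lower().split() for s in sentences]  # naive tokenizer (assumption)
    n = len(tokens)
    S = np.array([[jaccard(tokens[i], tokens[j]) for j in range(n)]
                  for i in range(n)])
    q = np.asarray(quality, dtype=float)
    return q[:, None] * S * q[None, :]

def greedy_dpp(L, k):
    """Greedily add the item that maximizes log det(L_Y), up to k items."""
    selected = []
    remaining = list(range(L.shape[0]))
    while remaining and len(selected) < k:
        best, best_val = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:  # every candidate makes the submatrix singular
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy input: sentences 0 and 1 are near duplicates, sentence 2 is unrelated.
sentences = [
    "the cat sat on the mat",
    "the cat sat on the mat today",
    "stock markets fell sharply",
]
quality = [1.0, 0.9, 0.8]  # hypothetical per-sentence quality scores
print(greedy_dpp(build_kernel(sentences, quality), k=2))
```

In this toy input the second sentence has a higher quality score than the third, but it overlaps heavily with the first, so the determinant penalizes selecting both and the unrelated third sentence is chosen instead.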
Keywords/Search Tags:automatic summarization, DPPs, neural network language model, autoencoders