Research And Implementation Of Automatic Web Novel Summary Based On Deep Learning

Posted on: 2024-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: P L Gao
GTID: 2555307055498054
Subject: Computer technology
Abstract/Summary:
The core idea of automatic summarization is to focus on the core content of a text and extract its key information by computational means. Current automatic text summarization techniques fall into two families: extractive and generative (abstractive). The former selects core sentences from the text as the summary and is mainly used for long texts. The latter mimics manual summarization, generating a summary in a "reading comprehension" manner, and is usually applied to short texts. Online novels, however, are an emerging category of contemporary literature with lengthy plots, and their content and structure are more random and less logical than those of traditional literature. For this reason, this paper proposes an extractive web novel summarization method that incorporates multidimensional features, and adds a generative secondary summarization step to address the shortcomings of the extractive summary. It also designs and implements an automatic summarization system for web novels. The work consists of the following three parts:

(1) Extractive summarization model with multi-feature fusion. This paper proposes a multi-feature fusion summarization model to address the problems that traditional extractive algorithms have with online novels. The model computes sentence similarity using edit distance and sentence vectors, scores sentences iteratively with a graph model, and outputs the highest-scoring sentences while integrating feature dimensions such as keyword information, clue sentences, chapter-name similarity, and sentence length. The MMR algorithm is then used to remove redundant summary sentences. A multi-feature ablation experiment showed that keyword information has the larger impact on the quality of the extracted summary. The paper therefore also improves keyword extraction by having the TextRank, TF-IDF, and LDA algorithms jointly extract keywords and vote on them, which strengthens the keyword information feature.

(2) Deep-learning-based secondary summarization of web novels. The primary summary contains the core sentences of the text but has poor readability and coherence. To address this, the paper combines the advantages of generative summarization and pre-trained models, and selects and experimentally compares four current mainstream Transformer-based pre-trained models on the primary summary: BART, Longformer, PEGASUS, and T5-Pegasus. These models perform generative secondary summarization to improve the conciseness and readability of the summaries.

(3) Design and implementation of an automatic web novel summarization system. Based on the multidimensional-feature primary summarization algorithm and the deep-learning secondary summarization algorithm, an interactive automatic summarization system for web novels is designed and implemented using JavaScript, Python, and the Flask framework. System testing verifies the compatibility, stability, and feasibility of the system.
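
The extractive stage in part (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses edit-distance similarity only (no sentence vectors), a PageRank-style iteration as the graph model, and a common form of MMR for redundancy removal. It omits the keyword, clue-sentence, chapter-name, and length features. The function names and the parameter values `damping=0.85` and `lam=0.7` are assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a similarity in [0, 1] (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by PageRank-style iteration over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return [], []
    sim = [[0.0 if i == j else similarity(sentences[i], sentences[j])
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores, sim


def mmr_select(scores, sim, k, lam=0.7):
    """Greedy MMR: trade off relevance against similarity to chosen sentences."""
    if not scores:
        return []
    top = max(scores) or 1.0  # normalize relevance to the same scale as sim
    selected, candidates = [], list(range(len(scores)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * scores[i] / top - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # restore original sentence order
```

In use, `rank_sentences` would be called on the sentences of a chapter and its two return values passed to `mmr_select` together with the desired number of summary sentences. A larger `lam` favors high-scoring sentences, while a smaller one penalizes near-duplicates more strongly.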
Keywords/Search Tags: automatic summary, web novels, feature fusion, secondary summary, deep learning