Research on Text Recommendation and Abstract Summarization Based on Topic Modeling

Posted on: 2022-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: J W Wu
GTID: 2518306602967869
Subject: Signal and Information Processing
Abstract/Summary:
With the continuous development of Internet technologies, the network has become an indispensable bridge between people and the world. A large amount of text is generated online every day, and retrieving the text that users are interested in from this volume has become an important problem. At the same time, as the pace of modern life accelerates, most people find it difficult to spend much time reading long reports and articles, so extracting the main information from an article to generate a summary helps readers choose what to read.

Text recommendation and text summarization are common natural language processing tasks. Text recommendation selects articles with similar content based on the similarity between the input query and the articles. Text summarization requires analyzing the relationships among the words in a text and extracting the core information of a long text for display. However, common recommendation and summarization models use only surface-level character and sequence information and cannot capture deeper topic features. Because they introduce no external knowledge, such models can also produce common-sense errors. This thesis therefore improves both the text recommendation model and the text summarization model, carries out research on topic features, common-sense knowledge, and underlying support, and completes a corresponding text analysis system.

First, semantic information and character information are connected, and extracting better text features can effectively improve the results of downstream tasks. Starting from the original text, this thesis extracts hidden topic features with a probabilistic topic model, obtains common-sense knowledge through knowledge graph embedding, and supplements the original text information with both. In this way, topic information and common-sense knowledge are introduced into the text recommendation task. The topic analysis also supports visualization of the topic weights in the text data, displaying the keywords of each topic. Experiments show that the improved algorithm, which combines knowledge graph embedding with topic features, yields more accurate text similarity and better recommendation results.

Second, text summarization is divided into extractive and abstractive summarization according to how the summary is produced. This thesis focuses on abstractive summarization and analyzes the principles, advantages, and disadvantages of common models. It then proposes a feature correction algorithm based on topic attention, which introduces topic features into the summary generation model to correct the generated summary. The effectiveness of the algorithm is verified by experiments.

Third, a probabilistic topic model repeatedly samples from probability distributions during training and testing, so sampling speed strongly affects the training and testing speed of the model. Using GPU multithreaded acceleration, this thesis builds a probabilistic topic model toolkit that parallelizes the sampling algorithms for common probability distributions at the lowest level of the system, and that organizes and optimizes common probabilistic topic models. The toolkit provides tool support for studying and researching probabilistic topic models, and its sampling speed has some advantages over comparable toolkits.
Keywords/Search Tags: text similarity, text abstract, text recommendation, topic model, knowledge graph, GPU multithreading acceleration
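
The abstract describes ranking articles by their similarity to a query and enriching surface features with topic features. The following is a minimal sketch of that idea only, not the thesis's algorithm: it assumes each text already has a term-weight vector and a topic distribution inferred by some topic model, and it mixes cosine similarity with a Hellinger-based topic similarity using a hypothetical weight `alpha`. The function names, the mixing formula, and the toy data are all assumptions for illustration, and the knowledge graph embedding step is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hellinger_similarity(p, q):
    """One minus the Hellinger distance between two topic distributions."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return 1.0 - math.sqrt(s / 2.0)


def recommend(query, docs, alpha=0.5, top_k=2):
    """Rank documents by a weighted mix of surface and topic similarity.

    alpha is a hypothetical mixing weight: 1.0 uses only term overlap,
    0.0 uses only topic similarity.
    """
    scored = []
    for doc_id, doc in docs.items():
        surface = cosine(query["terms"], doc["terms"])
        topical = hellinger_similarity(query["topics"], doc["topics"])
        scored.append((doc_id, alpha * surface + (1 - alpha) * topical))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (assumed): 4-term vocabulary, 2 topics.
query = {"terms": [1, 1, 0, 0], "topics": [0.8, 0.2]}
docs = {
    "a": {"terms": [1, 0, 0, 1], "topics": [0.8, 0.2]},  # same topics, partial term overlap
    "b": {"terms": [1, 1, 0, 0], "topics": [0.1, 0.9]},  # same terms, different topics
    "c": {"terms": [0, 0, 1, 1], "topics": [0.2, 0.8]},  # unrelated
}

if __name__ == "__main__":
    for doc_id, score in recommend(query, docs, top_k=3):
        print(doc_id, round(score, 3))
```

With `alpha=1.0` the ranking depends on term overlap alone and document `b` comes first. With the default `alpha=0.5`, document `a` moves ahead because it shares the query's topic distribution, which is the kind of effect topic features are meant to provide.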