Research And Application Of Hot News Events Brief Generation

Posted on: 2022-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: R F Liu
Full Text: PDF
GTID: 2518306524490394
Subject: Master of Engineering
Abstract/Summary:
With the arrival of the new media era, more and more media outlets are entering public view, and the volume of daily news reports keeps growing. How to help readers quickly understand hot news events is therefore a research topic of considerable value. At present, mainstream media mostly narrate news events through unstructured data such as text and video, while research on news platforms focuses mainly on intelligent recommendation based on user interests and current hot topics. Whether on a search results page or a recommendation page, information is fragmented and unintuitive, and the core information of the news is neither refined nor structured. Delivering the core information of hot news to users efficiently is therefore essential.

To address this problem, this thesis proposes a "hot news briefing" solution. A briefing consists of two parts. The first part is core event entity extraction, covering the core persons, locations, and organizations. The second part is a timeline summary, which shows how a news event develops at each time node on a visual timeline. The main research work is as follows:

(1) This thesis proposes an effective solution for discovering hot news events. The solution has two steps. The first step is the topic discovery stage: text summarization is first used to compress long news articles; then, within each category, clustering is used to extract topic clusters, and a topic model algorithm is used to extract topic keywords as topic tags. The second step is the hotness calculation stage: indicators such as the likes, dislikes, reposts, and page views of a single news article are used to calculate its popularity, and a weighted sum then gives the total popularity of each topic, which is ranked to obtain the hot topics.

(2) In contrast to the original news timeline summarization algorithm based on submodular optimization, this thesis proposes three improvements: 1) a key entity recognition algorithm is added ahead of the original algorithm to pre-filter news sentences that do not contain important entities, which improves the efficiency and accuracy of the model; 2) an adaptive algorithm is proposed for the selection ratio of important entities, which adaptively trades off the accuracy of entity selection against the retention of information and performs better than the fixed-ratio algorithm; 3) a Siamese-BERT vector representation rich in semantic information is proposed, which is more effective than the original iDF (inverse-date-frequency) method based on word frequency.

(3) The Hot News Briefing System (HNBS) was designed and implemented. The requirements and functional modules of HNBS were analyzed, and the design work was completed, including technology selection, system architecture, data structures, and internal and external interfaces. Finally, the main functional modules of the system were developed and implemented.
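
The hotness calculation stage in (1) can be illustrated with a minimal sketch. The abstract names the indicators (likes, dislikes, reposts, page views) but does not give the weights or the exact aggregation rule, so the `WEIGHTS` values, the `Article` type, and the choice to add article scores together per topic are all assumptions made for illustration, not the thesis's actual formula.

```python
from dataclasses import dataclass

# Hypothetical indicator weights; the abstract does not state the real values.
WEIGHTS = {"likes": 1.0, "dislikes": -0.5, "reposts": 2.0, "views": 0.01}


@dataclass
class Article:
    """One news article with its topic label and engagement indicators."""
    topic: str
    likes: int = 0
    dislikes: int = 0
    reposts: int = 0
    views: int = 0


def article_hotness(article: Article) -> float:
    """Weighted sum of a single article's indicators."""
    return sum(w * getattr(article, name) for name, w in WEIGHTS.items())


def rank_topics(articles):
    """Sum article hotness per topic (an assumed aggregation) and sort descending."""
    totals = {}
    for article in articles:
        totals[article.topic] = totals.get(article.topic, 0.0) + article_hotness(article)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [
        Article("a", likes=10, views=1000),
        Article("b", reposts=1),
        Article("a", dislikes=4),
    ]
    for topic, score in rank_topics(sample):
        print(topic, score)
```

In practice the weights would need tuning, and the indicators would likely need normalization, since raw page views are usually orders of magnitude larger than the other indicators.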
Keywords/Search Tags: Natural Language Processing, BERT, Named Entity Recognition, Timeline Summarization, Submodular Function