
Design And Implementation Of Clickbait News Detect System Based On Natural Language Processing

Posted on: 2022-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: W X Li
GTID: 2518306563962869
Subject: Software engineering
Abstract/Summary:
Driven by economic development and cultural prosperity, the media industry is developing well, and its growth and innovation have influenced society. This prosperity also brings problems, one of which is clickbait. Clickbait news is a form of disorder in the news of the current era. Driven by consumerism, economic interests, and self-media, it has spread rapidly. Clickbait draws readers in with suspenseful and exaggerated headlines that are seriously out of touch with the news content, and it has a harmful influence. To address these problems, this thesis designs and implements a clickbait news recognition system based on natural language processing.

To identify clickbait accurately, it is critical to formulate clickbait rules and build an accurate natural language processing model. In the requirements analysis phase, we collect the typical characteristics and identification rules of clickbait, and the goals of the system are determined according to these rules. The system identifies clickbait from two aspects: the news headline itself, and the relevance between the headline and the news content. By managing news items and authors, it aims to reduce the publication of clickbait news and improve the quality of the website. The system uses the BERT model to classify the headline text and determine whether the headline is inducing or untrue, which identifies clickbait on the basis of the headline alone. It uses the HAN model to generate a document vector for the news content and calculates its similarity with the title vector. Based on this similarity, the news item is judged to be clickbait or not. The system detects newly released news in real time, using Apache Storm for data stream processing, and saves the results to a MongoDB database. Managers access and manage the database through a web page. To date, the audit system has performed well in real-time tasks. It identifies clickbait news promptly and has contributed substantially to improving the quality of news.
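The two-part decision described in the abstract (a headline classification plus a title/content similarity check) could be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say which similarity measure or thresholds are used, so cosine similarity, the threshold values, the OR combination, and the names `cosine_similarity` and `is_clickbait` are all assumptions. The headline score stands in for the output of the BERT classifier, and the two vectors stand in for the title vector and the HAN document vector.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_clickbait(
    headline_score: float,
    title_vec: Sequence[float],
    doc_vec: Sequence[float],
    headline_threshold: float = 0.5,    # assumed value, not from the thesis
    similarity_threshold: float = 0.3,  # assumed value, not from the thesis
) -> bool:
    """Flag a news item if the headline looks like clickbait OR the
    headline is only weakly related to the body text.

    headline_score: hypothetical probability from a headline classifier.
    title_vec, doc_vec: hypothetical embeddings of the title and the body.
    """
    similarity = cosine_similarity(title_vec, doc_vec)
    return headline_score >= headline_threshold or similarity < similarity_threshold


if __name__ == "__main__":
    # Toy vectors for illustration only.
    print(is_clickbait(0.9, [1.0, 0.0], [1.0, 0.0]))  # flagged by the headline score
    print(is_clickbait(0.1, [1.0, 0.0], [0.0, 1.0]))  # flagged by low similarity
    print(is_clickbait(0.1, [1.0, 0.0], [1.0, 0.0]))  # not flagged
```

Combining the two signals with a logical OR makes the check strict: either signal alone is enough to flag an item. A real system might instead weight the two scores or learn the combination from labeled data.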
Keywords/Search Tags:Clickbait, Natural Language Processing, Text Similarity, Text Classification