A Research On The Recognition Of The “Sensational Headline” News Based On An Improved VSM-HowNet Fusion Similarity Algorithm

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhao
GTID: 2428330548956871
Subject: Engineering
Abstract/Summary:
As society develops, the "Sensational Headline" phenomenon has become increasingly common. "Sensational Headline" news is online news whose publisher uses attention-grabbing rhetorical techniques to write a sensational title that does not match the body text, so that the news attracts the audience and spreads quickly. This has led the public to question the professionalism of journalists and has caused many social problems. Research on techniques for recognizing "Sensational Headline" news therefore has practical significance for improving the online news environment.

The key problem studied in this thesis is the calculation of text similarity. Building on the traditional approach to recognizing "Sensational Headline" news, we propose two text similarity methods. The first is an Improved VSM Combined with Cosine Similarity Method, in which the words in a text are represented by the corresponding entries of the Synonymy Thesaurus. The second is a HowNet Method for calculating text similarity. We then study an Improved VSM-HowNet Fusion Similarity Algorithm, which combines these two methods and uses the fused similarity to distinguish "Sensational Headline" news from normal news.

The thesis first describes the "Sensational Headline" phenomenon, its causes and harms, and the current state of research on text similarity calculation, and states the significance and main contents of the research. It then describes the text analysis process, the concept of text similarity and its calculation methods, and the problems and limitations of both the traditional text similarity methods and Wang's topic-word form text similarity method. In the methodology part, we present the Improved VSM Combined with Cosine Similarity Method, in which the traditional word vector is replaced by a vector of synonyms from the Synonymy Thesaurus, followed by the HowNet text similarity method and the Improved VSM-HowNet Fusion Similarity Algorithm, with the aim of making the recognition of "Sensational Headline" news and normal news more complete and efficient. Next, based on the new text similarity methods and the limitations of Wang's method, we describe the purposes, corpora, contents and procedure of the experiments.

Finally, we report and analyze the experimental results. In a comparison across different ratios of "Sensational Headline" news to normal news and across different data sets, the Fusion Similarity Algorithm performed better than Wang's topic-word form text similarity method. With the Improved VSM Combined with Cosine Similarity Method, the precision on "Sensational Headline" news was 60.7%, better than Wang's method; on normal news, the precision, recall and F1-Measure increased by 1.35%, 6.71% and 10.02% respectively compared with Wang's method. With the Improved VSM-HowNet Fusion Similarity Algorithm, the total precision, total recall and total F1-Measure for recognizing "Sensational Headline" news and normal news were all higher than those of the other text similarity methods, and the algorithm had an advantage over them when recognizing news of unknown type.

In conclusion, the Improved VSM Combined with Cosine Similarity Method, which represents the words in a text by entries of the Synonymy Thesaurus, and the HowNet Method both give good results when compared with Wang's text similarity method. The total precision, total recall and total F1-Measure obtained on "Sensational Headline" news and normal news with the Improved VSM-HowNet Fusion Similarity Algorithm are better than those of the other text similarity methods.
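
The idea behind the two methods can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy `SYNONYM_GROUP` table stands in for the Synonymy Thesaurus, the HowNet similarity is passed in as a precomputed number rather than computed, and the linear weighting with `alpha` and the decision `threshold` are assumptions, since the abstract does not state how the two similarities are fused or how a headline is finally classified.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus: word -> synonym-group ID.
# A real system would load these groups from a synonym thesaurus.
SYNONYM_GROUP = {
    "shocking": "G1", "stunning": "G1", "astonishing": "G1",
    "price": "G2", "cost": "G2",
    "rise": "G3", "increase": "G3",
}


def to_group_vector(tokens):
    """Count synonym-group IDs; words outside the thesaurus keep their own form."""
    return Counter(SYNONYM_GROUP.get(token, token) for token in tokens)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(vec_a[key] * vec_b[key] for key in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(value * value for value in vec_a.values()))
    norm_b = math.sqrt(sum(value * value for value in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def improved_vsm_similarity(title_tokens, body_tokens):
    """VSM similarity computed over synonym groups instead of raw words."""
    return cosine_similarity(to_group_vector(title_tokens),
                             to_group_vector(body_tokens))


def fused_similarity(vsm_sim, hownet_sim, alpha=0.5):
    """Assumed linear fusion; the abstract does not give the actual fusion rule."""
    return alpha * vsm_sim + (1 - alpha) * hownet_sim


def is_sensational(title_tokens, body_tokens, hownet_sim,
                   threshold=0.3, alpha=0.5):
    """Flag a headline whose fused similarity to the body is below a threshold.

    hownet_sim is a placeholder for a HowNet-based title/body similarity in [0, 1].
    The threshold value is illustrative only.
    """
    vsm_sim = improved_vsm_similarity(title_tokens, body_tokens)
    return fused_similarity(vsm_sim, hownet_sim, alpha) < threshold
```

In this sketch a title such as `["price", "rise"]` and a body such as `["cost", "increase"]` map to the same synonym groups, so their similarity is close to 1 even though they share no surface words, which is the effect the synonym-based representation is meant to achieve.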
Keywords/Search Tags: "Sensational Headline" News, Improved VSM Combined with Cosine Similarity Method, HowNet Method, Improved VSM-HowNet Fusion Similarity Algorithm