
Research Of Title Party News Identification Technology Based On Latent Semantic Analysis

Posted on: 2016-04-10 | Degree: Master | Type: Thesis
Country: China | Candidate: J Luo | Full Text: PDF
GTID: 2298330479950310 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the mobile Internet in recent years, a growing number of media organizations, journalists, government administrators, and even ordinary people can publish and distribute news online. Because media communication over the Internet is global, open, shared, and free, "title party" news, which is written to attract page hits and exposure, is increasing and spreading rapidly. The title party phenomenon not only stokes social tension and confusion, but also leads the public to doubt the professionalism of news publishers and the competence of governments and organizations. Research on title party news is therefore of guiding significance for improving the quality of news.

Title party news refers to news whose publisher crafts an eye-catching title to attract readers and raise click-through rates, using rhetorical devices such as exaggeration, eroticism, inauthenticity, and illogicality. To counter the title party phenomenon, this thesis proposes and implements a title party news identification technology based on latent semantic analysis (LSA). The objects of identification are news pages from mainstream news websites in China and abroad. The aim is for the LSA-based method to filter out title party news with the best achievable identification result, to identify title party news quickly and efficiently, to improve the reading quality of news, and to limit the spread of title party news to some extent.

First, this thesis presents the status of the title party phenomenon in China and abroad, reviews the state of research on title party news identification technology in China, and points out its problems and limitations. On this basis, it states the research significance and main content of the subject.

Second, it introduces and summarizes the background knowledge closely associated with title party news identification, including the principle of web page noise elimination, the vector space model (VSM), and singular value decomposition (SVD).

Third, it proposes a title party news identification system based on LSA and describes the relevant technologies: news page download based on the Hypertext Transfer Protocol (HTTP), content extraction from web pages based on the line-block distribution algorithm, word segmentation based on the forward maximum matching method, computation of the VSM, approximate matrix computation based on SVD, and title party news identification based on LSA. It also describes the detailed design and working principle of each module.

Finally, by analyzing the experimental results, it demonstrates the feasibility and validity of title party news identification based on LSA.
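
The VSM, SVD, and LSA steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `build_term_doc` and `lsa_title_body_similarity`, the raw term-count weighting, the rank `k`, and the idea of scoring a page by the cosine similarity between its title and its body in the latent space are all assumptions made for illustration. The input is assumed to be already segmented into tokens.

```python
import numpy as np


def build_term_doc(docs, vocab):
    """Build a terms x documents count matrix (a simple VSM; hypothetical helper)."""
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, tokens in enumerate(docs):
        for word in tokens:
            if word in index:
                matrix[index[word], j] += 1.0
    return matrix


def lsa_title_body_similarity(term_doc, title_vec, body_vec, k):
    """Cosine similarity of a title and a body projected into a rank-k latent space.

    Assumption: both vectors are projected with U_k^T, one common fold-in variant.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    Uk = U[:, :k]  # keep the k strongest latent dimensions
    t = Uk.T @ title_vec
    b = Uk.T @ body_vec
    denom = np.linalg.norm(t) * np.linalg.norm(b)
    # Guard against a vector that has no component in the latent space.
    return float(t @ b / denom) if denom > 0 else 0.0


# Toy corpus of pre-segmented news bodies (made-up data, for illustration only).
corpus = [
    ["storm", "hits", "coast"],
    ["storm", "damage", "coast"],
    ["star", "secret", "shock"],
]
vocab = sorted({w for doc in corpus for w in doc})
A = build_term_doc(corpus, vocab)

body = build_term_doc([corpus[0]], vocab)[:, 0]
matching_title = build_term_doc([["storm", "coast"]], vocab)[:, 0]
unrelated_title = build_term_doc([["star", "secret"]], vocab)[:, 0]

sim_match = lsa_title_body_similarity(A, matching_title, body, k=2)
sim_unrelated = lsa_title_body_similarity(A, unrelated_title, body, k=2)
print(sim_match, sim_unrelated)
```

Under these assumptions, a title whose latent-space similarity to its own body is low would be a candidate for title party news. The decision rule and any threshold would have to be chosen from experimental data, which the abstract does not specify.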
Keywords/Search Tags: title party news, latent semantic analysis, singular value decomposition, distribution of line-block algorithm