
Research On The Calculation Method For Semantic Similarity Of Sentence And Its Application

Posted on: 2017-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2348330512465064
Subject: Control engineering
Abstract/Summary:
With the rapid progress of information technology and the growing popularity of the Internet, web news, blogs, and WeChat public-account articles have become some of the main channels through which people obtain information and follow political affairs. However, "title party" (clickbait) news, which websites publish to attract readers, appears endlessly, and filtering such news manually has become unrealistic. Techniques that let computers identify "title party" news are therefore urgently needed. The core of such techniques is sentence similarity calculation, so this thesis focuses on a method for calculating sentence semantic similarity and on a topic sentence extraction algorithm.

The objects of study are Chinese sentences. Compared with other natural languages such as English, Chinese has its own characteristics and poses many research difficulties: low accuracy, a large vocabulary, complex semantics, meaning that depends on sentence structure and context, a basic grammatical unit that is hard to determine, and so on. To address these problems, after a review of the literature, the Word2Vec algorithm was selected to train the data model. It resolves most of the difficulties above, has good learning ability and training efficiency, describes the semantics of Chinese vocabulary well, and can distinguish the different actual meanings of a word from its context. In addition, analysis of the articles studied shows that topic sentences are characterized by a high repetition frequency and a wide distribution range. On this basis, the topic sentence extraction algorithm is optimized using the sentence semantic similarity algorithm and these topic sentence features.

The research work includes the following aspects:
(1) The literature on sentence similarity, the basic concepts of sentence similarity, and the methods for news data acquisition and preprocessing are analyzed and summarized. A variety of existing sentence similarity calculation methods and topic sentence extraction algorithms are studied, and the advantages and disadvantages of each are set out.
(2) The Word2Vec model training algorithm is studied in depth. To improve the model results, a second round of model training is proposed and tested experimentally, with good results. The characteristics of a large number of news topic sentences are summarized and analyzed, and a topic sentence extraction algorithm based on these characteristics is proposed.
(3) The topic sentence extraction model is used to extract topic sentences and to identify "title party" news, with 200 news articles as experimental instances. The experimental results show that the sentence semantic similarity calculation method and the optimized topic sentence extraction algorithm proposed in this thesis achieve better performance.
(4) The semantics-based sentence similarity calculation method and the "title party" news prediction algorithm are applied to a practical scenario. The algorithms are implemented as a system in which data is exchanged through WebSocket. Users can train their own models or use the default models for identification, which gives the work practical application value.
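The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the author's method. It assumes a sentence is represented by the average of its word vectors, that similarity is the cosine between two such averages, that the topic sentence is the body sentence most similar on average to the others, and that a title is flagged when its similarity to the topic sentence falls below a threshold. The vectors in `TOY_VECTORS`, the function names, and the threshold of 0.5 are all hypothetical; real input would be segmented Chinese text and vectors from a trained Word2Vec model.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load vectors from a trained Word2Vec model.
TOY_VECTORS = {
    "flood": [0.9, 0.1, 0.0],
    "rain": [0.8, 0.2, 0.0],
    "city": [0.5, 0.5, 0.1],
    "rescue": [0.7, 0.3, 0.1],
    "celebrity": [0.0, 0.1, 0.9],
    "secret": [0.1, 0.0, 0.8],
    "shocking": [0.0, 0.2, 0.9],
}


def sentence_vector(tokens, vectors=TOY_VECTORS):
    """Average the vectors of known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_similarity(tokens_a, tokens_b):
    """Similarity of two tokenized sentences via averaged word vectors."""
    va, vb = sentence_vector(tokens_a), sentence_vector(tokens_b)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def pick_topic_sentence(sentences):
    """Assumed heuristic: the sentence most similar, on average, to the rest."""
    if len(sentences) == 1:
        return sentences[0]

    def mean_similarity(i):
        others = [s for j, s in enumerate(sentences) if j != i]
        total = sum(sentence_similarity(sentences[i], s) for s in others)
        return total / len(others)

    best = max(range(len(sentences)), key=mean_similarity)
    return sentences[best]


def looks_like_title_party(title, sentences, threshold=0.5):
    """Flag a title whose similarity to the topic sentence is below threshold."""
    topic = pick_topic_sentence(sentences)
    return sentence_similarity(title, topic) < threshold
```

With a body such as `[["flood", "rain", "city"], ["rescue", "city", "flood"], ["rain", "rescue"]]`, a title like `["city", "flood"]` stays close to the topic sentence, while `["shocking", "celebrity", "secret"]` does not and would be flagged. Averaging word vectors is only one simple choice; the thesis's optimized extraction algorithm may weight repetition frequency and distribution range differently.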
Keywords/Search Tags:semantic similarity, Word2Vec algorithm, topic sentence extraction, title party news