Font Size: a A A

The Research And Implementation On Sentence Similarity Based On Deep Neural Networks

Posted on:2018-11-05Degree:MasterType:Thesis
Country:ChinaCandidate:X C XieFull Text:PDF
GTID:2348330533966801Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
As one of the core tasks in natural language processing,measuring semantic similarity between two texts plays a vital role in various aspects such as query recommendation,automatic question-answering system and abstract extraction.At present,methods based on sentence similarity basically centre on lexical matching,semantic analysis trees and some structured semantic knowledge which depends on external resources.However,each approach has its limitation: lexical matching is not an ideal way to acquire semantic similarity characteristic;semantic-knowledge-based approach can't be applied universally to all fields;and semantic-analysis-trees approach,as shown by recent research,is only confined to those texts with a good grammatical organization.In recent years,with widely use in extensive fields including image process and speech recognition,deep learning has obtained remarkable achievement.What's more,recent research has indicated that deep learning plays quite well in natural language processing,whereby this paper presents a model to measure similarity between sentences based on Long Short Term Memory?LSTM?and Convolutional Neural Networks?CNN?,combined with some additional features between texts.We represent the input sentences by using the popular word embedding methods,word2 vec and GloVe,extract the pre-and post-dependent relationship and local information by using LSTM and CNN respectively and then combine the additional features between the sentences to measure the sentence similarity.The method based on deep neural network overcomes the lexical gap problem.It allows computer to identify the semantically equivalent sentences with different organization,provides capability of extracting the information of sentences from different perspectives to enrich expression,and finally combined with the additional features of sentences,leading to more accurate result.In order to demonstrate the performance of the sentence similarity model proposed in this paper,we have carried out semantic similarity and semantic correlation experiments on three commonly used public datasets,MSRP,SICK2014 and MSRVID.The experimental results show that,based on LSTM and CNN,and combined with the word embedding and extra features,our model is up to the state of the art with competitive performance on the above datasets,which definitely attests to its high availability and applicability.
Keywords/Search Tags:sentence similarity, LSTM, CNN, extra features
PDF Full Text Request
Related items