
Research On Paraphrase Identification Method Based On Deep Semantic Understanding

Posted on: 2023-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: C C Kong
GTID: 2558306848955199
Subject: Software engineering
Abstract/Summary:
Paraphrase identification aims to determine whether two sentences with different wording have the same meaning. It is a core technology for handling synonymy in natural language and is widely used in text summarization, machine translation, and automatic question answering. The key to paraphrase identification methods based on deep neural networks is the semantic representation of the two sentences, and how to make full use of syntactic structure information to enhance that representation has become a new challenge. To address this challenge, we focus on the deep understanding of semantics between two sentences and on the use of syntactic structure information in paraphrase identification. We study cross-sentence semantic representation learning based on syntactic structure, together with the paraphrase identification model. The contributions are summarized as follows.

(1) We design and implement a deep interactive paraphrase identification model based on syntactic structure. For single-sentence representation learning, we design a Tree-LSTM that computes the semantic representation of each node bottom-up along the syntactic structure tree. For cross-sentence representation learning, we design an interaction module composed of cross-sentence attention and self-attention. The former learns features from all nodes of the other sentence, and the latter strengthens the learned interaction features; stacking multiple interaction modules realizes deep interactive learning. The final sentence representation is obtained by applying average pooling and max pooling over all nodes, and a multi-layer perceptron then performs paraphrase identification on the sentence pair. Experimental results on the public Quora dataset show that, compared with the baseline model, the accuracy of our model increases by 0.24% to 89.54%.

(2) We propose a cross-sentence representation enhancement method based on semantic matching over the structural tree. Since whether two sentences match is closely related to whether their constituents match, we treat semantic matching at the constituent level as a feature to be integrated into the sentence representation. By designing a multi-perspective semantic matching function and a learnable weight matrix, we obtain the semantic relevance of the two sentences and use this feature to enhance the current sentence representation. Experimental results on the public Quora dataset show that, compared with the baseline model, the accuracy of our method increases by 0.32% to 89.72%, which is 0.18% higher than the model in study (1).
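
The interaction module and pooling step of study (1) can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes node vectors are already available (for example, from a Tree-LSTM), uses plain dot-product attention with no learned projection matrices, and all function names (cross_attention, interaction_layer, pool) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_attention(nodes_a, nodes_b):
    """For each node vector of sentence A, return the attention-weighted
    sum of all node vectors of sentence B (unparameterized assumption)."""
    dim = len(nodes_b[0])
    attended = []
    for query in nodes_a:
        weights = softmax([dot(query, key) for key in nodes_b])
        attended.append(
            [sum(w * key[i] for w, key in zip(weights, nodes_b)) for i in range(dim)]
        )
    return attended


def interaction_layer(nodes_a, nodes_b):
    """One interaction layer: cross-sentence attention followed by
    self-attention over the resulting interaction features."""
    interacted = cross_attention(nodes_a, nodes_b)
    return cross_attention(interacted, interacted)  # self-attention


def pool(nodes):
    """Concatenate average pooling and max pooling over all nodes."""
    dim = len(nodes[0])
    avg = [sum(n[i] for n in nodes) / len(nodes) for i in range(dim)]
    mx = [max(n[i] for n in nodes) for i in range(dim)]
    return avg + mx


if __name__ == "__main__":
    # Toy node vectors standing in for Tree-LSTM outputs (made-up values).
    sent_a = [[1.0, 0.0], [0.0, 1.0]]
    sent_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    features = interaction_layer(sent_a, sent_b)
    print(pool(features))  # fixed-size vector: 2 * dim entries
```

In a full model, the pooled vectors of both sentences would be passed to a multi-layer perceptron, and several such layers with learned weights would be stacked; those parts are omitted here.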
Keywords/Search Tags:Natural language processing, Paraphrase identification, Semantic representation, Syntactic structure, Deep interaction