
Research On Paraphrase Identification Based On Neural Network

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Zhao
GTID: 2518306047482144
Subject: Software engineering
Abstract/Summary:
Paraphrase identification based on neural networks is the task of determining, within a neural network framework, whether two passages of text are semantically consistent. Paraphrase identification underlies research on machine translation, dialogue systems, automatic question answering, and plagiarism detection, and it is a key technology and core problem of natural language processing. This thesis aims to improve the performance of paraphrase identification within the neural network framework from three aspects: semantic interaction between different syntactic structures, interaction between objects of multiple granularities within sentences, and text-level paraphrase identification that is sensitive to partial matching. The main research work includes:

(1) Existing sentence-level paraphrase identification methods do not account for the fact that different syntactic structures should play different roles. To address this, the thesis presents MSSIAM (A Model of Syntactic and Semantic Interaction based on Attention Mechanism). Syntactic structure by itself carries no semantic information, whereas words do, so the model divides each sentence at the granularity of the syntactic roles that its words play. By integrating syntax and semantics, the model obtains syntactic structures that contain semantic information, which completes the interaction between individual words and syntactic structure. An attention mechanism then handles the interaction between syntactic structures, so that different syntactic structures contribute differently during the interaction. The model is tested on the MSRP data set, where the attention-based syntactic and semantic interaction model is compared with a model without syntactic weights. The experimental results verify the effectiveness of MSSIAM.

(2) Existing sentence-level paraphrase identification methods lack semantic interaction among the representations of words, phrases, and sentences. To address this, the thesis proposes MGIPIM-SAM (Multi-Granularity Interactive Paraphrase Identification Model based on Self-Attention Mechanism). The model uses an expanded autoencoder to obtain semantic representations of multi-granularity features such as words, phrases, and sentences. A self-attention mechanism is introduced to address the insufficient semantic interaction among these multi-granularity features. The model is tested on the MSRP data set, and the experimental results show that MGIPIM-SAM achieves a better F-score.

(3) Existing text-level paraphrase identification methods do not consider partial matching. To address this, the thesis presents PMCNN (Partial Matching Convolution Neural Network), which extracts features through successive convolution and pooling layers and uses a multi-layer perceptron to complete the interaction between features, thereby handling partial matching. The model is tested on the PAN@2013 and PAN@2014 plagiarism source retrieval datasets. The experimental results show that PMCNN is superior to the baseline method in statistical validity.
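As an illustration of how an attention mechanism can let different syntactic roles contribute differently when two sentences interact, the following is a minimal sketch of generic scaled dot-product attention. It is not the MSSIAM architecture from the thesis: the role names, the two-dimensional toy vectors, and the helper functions `softmax`, `dot`, and `attend` are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys):
    """Scaled dot-product attention of one query vector over a list of key vectors.

    Returns the attention weights and the weighted sum of the keys.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return weights, context


# Hypothetical toy vectors standing in for the semantic representations
# of the syntactic roles in two sentences (assumed values, not real data).
sentence_a = {"subject": [1.0, 0.0], "predicate": [0.0, 1.0]}
sentence_b = {"subject": [0.9, 0.1], "predicate": [0.1, 0.9], "object": [0.5, 0.5]}

roles_b = list(sentence_b)
keys_b = [sentence_b[r] for r in roles_b]

# Each role in sentence A attends over all roles in sentence B.
for role, vec in sentence_a.items():
    weights, _ = attend(vec, keys_b)
    print(role, dict(zip(roles_b, (round(w, 3) for w in weights))))
```

In this toy setup each role in the first sentence places its largest weight on the most similar role in the second sentence, which is the kind of unequal, role-dependent weighting the abstract describes. A trained model would learn the vectors and any projection matrices from data rather than fixing them by hand.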
Keywords/Search Tags: Paraphrase Identification, Neural Network, Syntax, Attention Mechanism, Partial Semantic Matching