Research And Application On Text Semantic Similarity Computation Algorithm

Posted on: 2018-03-13  Degree: Master  Type: Thesis
Country: China  Candidate: M Yang
GTID: 2348330542465277  Subject: Computer technology
Abstract/Summary:
Text similarity computation is a core problem in Natural Language Processing (NLP). Text semantic similarity computation builds on text similarity computation and semantic analysis, and it has broad application prospects. Based on research into text semantic similarity, this dissertation proposes two methods, one based on structural features and one based on neural networks. It then applies the text semantic similarity algorithm to a question answering (QA) system and achieves satisfactory results. The main contents are as follows:

(1) Study on Text Similarity Computation Approach based on Structural Representations
Most text semantic similarity approaches use plain similarity features to represent the similarity of texts at the sentence level; few use structural features. This dissertation applies structural features to represent the syntactic and grammatical information of a sentence. The structural features are derived from the Phrase-based Shallow Tree (PST) and the Phrase-based Dependency Tree (PDT), which are built on the shallow syntax tree and the dependency tree, respectively. The dissertation then proposes a novel method that combines the structural features with plain features in a support vector regression (SVR) model to compute text semantic similarity. Experimental results show that this method outperforms one that uses plain features alone: the Pearson correlation coefficient increases by 0.054 with PST and by 0.041 with PDT.

(2) Study on Textual Similarity Computation based on Tree-LSTM
To improve text semantic similarity performance on long texts, the dissertation proposes a deep learning algorithm. It first introduces structural features based on the New Phrase-based Shallow Tree (NPST) and the New Phrase-based Dependency Tree (NPDT), which are designed to suit a neural network model. It then combines these two structural features with different Tree-LSTM (Tree-Structured Long Short-Term Memory) models, namely Child-Sum Tree-LSTM and N-ary Tree-LSTM, to study semantic textual similarity performance. Experimental results show that the new approach outperforms the previous method on long texts: the Pearson correlation coefficient increases by 0.012 with NPST and by 0.053 with NPDT.

(3) Study on Question Answering System based on Textual Similarity Computation
In this dissertation, text semantic similarity computation approaches are applied to a QA system that processes the content of the user's query and ranks the search results. The system increases the accuracy of the returned answers and reduces labor costs.
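The results above are reported as gains in the Pearson correlation coefficient between predicted and reference similarity scores. As a minimal sketch of how that metric is computed (not the dissertation's evaluation code), the following uses only the Python standard library. The function name `pearson` and the sample scores are hypothetical and chosen purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical reference ratings and model predictions (illustrative only,
# not data from the dissertation).
gold = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.4, 3.8, 4.9]
r = pearson(gold, predicted)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means the predicted scores rise and fall with the reference ratings, so an increase of 0.054 indicates closer agreement with the reference scores on the same test set.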
Keywords: Text Semantic Similarity Computation, Structural Representation, SVR, Tree-LSTM, QA system