Research On Computing Method Of Chinese Sentence Similarity Based On Deep Learning

Posted on:2020-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:H Yang

GTID:2518306464995529

Subject:Software engineering

Abstract/Summary:

Feature extraction and similarity computation for Chinese sentences is an active research topic in natural language processing. Existing sentence similarity methods do not capture sentence semantics comprehensively, so their similarity results are not accurate enough. This thesis therefore proposes a method for Chinese sentence feature extraction and similarity calculation that covers sentence representation, semantic feature extraction, and regularization parameter selection. The main contributions are as follows.

For sentence representation, a Chinese sentence similarity calculation method based on a deep autoencoder is proposed, together with an algorithm for semantic feature extraction and similarity calculation based on a deep sparse autoencoder. Sentences are first expressed as high-dimensional sparse vectors. Deep learning is then used to learn the non-linear characteristics of the sentences, transforming the high-dimensional sparse vectors into low-dimensional vectors of essential features. The process is closer to pure end-to-end learning and avoids building a stop-word list. Finally, the low-dimensional features are used directly for sentence similarity calculation.

Manually tuning the regularization parameter during experiments leads to long model training times. To address this, a method is proposed that calculates the value of the L2 regularization parameter using the first-order origin moment, the second-order origin moment, the variance, and maximum likelihood estimation. The four values are computed from the X matrix of the data set. In neural network handwritten digit recognition experiments, the method improves accuracy over Bayesian regularization by about 1.14 to 1.50 percentage points on the Coursera data set and 0.11 to 0.75 percentage points on the MNIST data set. The method therefore makes the algorithm more efficient and is shown to be valid.

The experimental results show that computing sentence similarity from the extracted sentence features is more accurate than sentence similarity computation based on the relation vector model and than the Jaccard text similarity algorithm based on word embeddings. The computational time complexity is only O(n). The regularization parameters of the L2 regularization method used in the sentence similarity experiments are derived from a generalization of the second-order origin moment concept.
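
The abstract names the statistics used to choose the L2 regularization parameter but does not state the formula that maps them to a parameter value, so the following Python sketch only computes the statistics themselves. It assumes the entries of the X matrix are pooled into one sample; the function name `moment_summary` and that pooling choice are illustrative assumptions, not taken from the thesis. The first-order origin moment is the mean of the entries, the second-order origin moment is the mean of their squares, and the variance is the second-order moment minus the square of the first. Under a Gaussian assumption, the maximum likelihood estimate of the variance is this same 1/n variance.

```python
from typing import Dict, List


def moment_summary(X: List[List[float]]) -> Dict[str, float]:
    """Compute raw (origin) moments and variance over all entries of X.

    Hypothetical helper for illustration only: it pools every entry of X
    into a single sample and makes one pass over the data.
    """
    n = 0
    sum_x = 0.0   # running sum of entries
    sum_x2 = 0.0  # running sum of squared entries
    for row in X:
        for value in row:
            n += 1
            sum_x += value
            sum_x2 += value * value
    if n == 0:
        raise ValueError("X must contain at least one entry")

    m1 = sum_x / n    # first-order origin moment, E[X]
    m2 = sum_x2 / n   # second-order origin moment, E[X^2]
    # Variance via the identity Var[X] = E[X^2] - (E[X])^2; clamp tiny
    # negative values that can appear from floating-point rounding.
    variance = max(0.0, m2 - m1 * m1)
    return {"m1": m1, "m2": m2, "variance": variance}


if __name__ == "__main__":
    # Toy 2x2 matrix standing in for a data set's X matrix.
    print(moment_summary([[1.0, 2.0], [3.0, 4.0]]))
```

How these quantities are combined into the regularization parameter is specific to the thesis and is not reproduced here.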

Keywords/Search Tags:

deep learning, semantic feature extraction, similarity calculation, L2 regularization

Related items

1	Semantic Based Similarity Analysis Of Human Video
2	Research On Music Similarity Calculation Method Based On Deep Learning
3	Research On Text Similarity Calculation Method And Its Application In Financial Field
4	Research On Semantic Similarity Calculation Of Chinese Short Text Based On Deep Learning
5	Research On Calculation Method Of Text Similarity Based On Deep Learning In Intelligent Question Answering System
6	Research On Movie Similarity Calculation Using Multi-feature Method
7	The Research On Deep Web Database Based On Semantic Similarity Calculation
8	Research On Chinese Sentence Similarity Calculation Based On Deep Learning
9	A Conceptual Query Based Multi-Document Summarization In Biomedical Domain
10	Sentence Semantic Similarity Learning Based On Deep Learning