Natural language processing is a core problem in artificial intelligence. Solving it requires enabling machines to correctly parse the semantics of natural language and to obtain some form of semantic representation. Since sentences are the main units that carry semantic information in natural language, accurately interpreting sentence semantics is key to natural language understanding. Given that distributed word embeddings have been successfully applied in tasks such as machine translation and automatic summarization, it is natural to extend distributed representations to longer units of text such as sentences, paragraphs, or documents, that is, to map their semantics into a low-dimensional continuous space. Sentences, formed by combining words according to syntactic structure, are the basic linguistic units from which paragraphs and documents are built. Existing sentence embedding methods mainly compute a weighted sum or average of the word embeddings in a sentence, ignoring word order and syntactic structure, so the learned sentence embeddings are inaccurate.

This paper proposes two methods to address two sources of this inaccuracy: the lack of syntactic structure information, and the long-distance dependencies introduced by long sentences. The first method learns sentence embeddings from syntactic structure features: to reduce parameter-training time and the influence of unimportant words on sentence semantics, complex syntax trees are pruned during construction, and the syntactic information is converted into word weights. The second method fuses each word vector with weights derived from its syntactic information (part of speech, phrase, and clause), highlighting the semantic contribution of words in different sentence structures; the fused vectors are then encoded by a siamese LSTM network, which effectively mitigates the long-distance dependency problem caused by excessive sentence length. Compared with existing supervised and unsupervised learning algorithms, the proposed methods improve the accuracy of sentence similarity computation.
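The fusion step of the second method can be illustrated with a minimal sketch. The example below weights each word vector by a score derived from its part-of-speech tag before averaging into a sentence vector, then compares two sentence vectors by cosine similarity. The tag set and weight values here are assumptions for illustration only, and the siamese LSTM encoder of the actual method is omitted for brevity.

```python
import numpy as np

# Hypothetical syntactic weights: content words (nouns, verbs) are assumed
# to contribute more to sentence semantics than function words. These values
# are illustrative, not the trained weights of the proposed method.
ROLE_WEIGHTS = {"NOUN": 1.0, "VERB": 1.0, "ADJ": 0.8, "DET": 0.2, "ADP": 0.2}

def sentence_embedding(word_vectors, pos_tags):
    """Fuse word vectors with syntactic weights: a weighted average in which
    each word's weight depends on its part-of-speech tag."""
    weights = np.array([ROLE_WEIGHTS.get(tag, 0.5) for tag in pos_tags])
    vecs = np.stack(word_vectors)
    return (weights[:, None] * vecs).sum(axis=0) / weights.sum()

def cosine_similarity(a, b):
    """Similarity score between two sentence vectors, in [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In the full method, the weighted vectors would be fed token by token into two LSTM encoders with shared parameters (the siamese configuration), and similarity would be computed between the two final hidden states rather than between simple averages.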