
A Sentence Representation Method Based On Syntax And Semantics

Posted on: 2020-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Le
Full Text: PDF
GTID: 2428330620451114
Subject: Computer Science and Technology
Abstract/Summary:
Sentence similarity modeling lies at the core of many natural language processing applications and has therefore received much attention in recent years. Measuring sentence similarity is challenging due to the ambiguity and variability of linguistic expression. A large number of prior works focused on feature engineering, and several types of sparse features have been shown to be useful. More recently, owing to the success of word embeddings, researchers have studied sentence similarity modeling via sentence embeddings. Most of these works focus on learning semantic information and encoding it as a continuous vector, yet the syntactic information of sentences has not been fully exploited. On the other hand, prior works have shown the benefits of structured trees that encode syntactic information, while few methods in this line of work exploit the advantages of word embeddings and of another powerful technique, the attention weight mechanism.

Motivated by these observations, this thesis combines the advantages of the techniques above to develop a more effective method. In a nutshell, this thesis proposes the ACV-tree model, which models sentence similarity in a structured manner and seamlessly integrates semantic information, syntactic information, and the attention weight mechanism. To measure similarity, this thesis develops a new tree kernel, the ACVT kernel, which is tailored to the proposed structure and designed to be easy to apply. The ACV-tree model can serve as a general framework: word embeddings and attention weights act as its building blocks, and users can replace them with other off-the-shelf (or more powerful, future) word embedding techniques and attention weight schemes. Moreover, unlike most sentence embedding based models, the ACV-tree model requires no time-consuming learning or training once word embeddings are available. Compared against existing word embedding based models for sentence similarity, the ACV-tree model also achieves better performance on almost all datasets used in our experiments.

To verify the effectiveness of the proposed model, this thesis conducts experiments on 19 datasets derived from the Semantic Textual Similarity (STS) task of the International Semantic Evaluation (SemEval) competition. Each dataset contains many sentence pairs, and the datasets cover a wide range of domains such as news, web forums, images, and Twitter. The experimental results on these 19 widely used STS datasets demonstrate that the model is effective and competitive against state-of-the-art models. Additionally, this thesis studies the generality of the ACV-tree model by using various attention weight mechanisms and word embedding techniques; specifically, many variant methods based on word embeddings, attention weighting mechanisms, and syntax are explored. The experimental results confirm that many attention weight mechanisms and word embedding techniques can be seamlessly integrated into the ACV-tree model, demonstrating its robustness and generality.
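
To make the idea concrete, the Python sketch below is a minimal, hypothetical illustration of the general approach described above, not the thesis's actual ACV-tree construction or ACVT kernel: it assumes each leaf of a constituency-style parse carries a word embedding and an attention weight, and it scores two trees by accumulating attention-weighted cosine similarities over node pairs with matching labels. The node labels, toy vectors, weights, and decay factor are all invented for illustration.

# Illustrative sketch only: attention-weighted word vectors attached to
# the leaves of a parse tree, compared with a simplified tree-kernel score.
from dataclasses import dataclass, field
from typing import List, Optional
import math

@dataclass
class Node:
    label: str                            # constituency label, e.g. "NP", "VP", or a POS tag
    children: List["Node"] = field(default_factory=list)
    vector: Optional[List[float]] = None  # word embedding (leaves only)
    weight: float = 1.0                   # attention weight (leaves only)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def node_sim(a: Node, b: Node, decay: float = 0.5) -> float:
    """Similarity contribution of two nodes; nonzero only when labels match."""
    if a.label != b.label:
        return 0.0
    if a.vector is not None and b.vector is not None:   # leaf vs. leaf
        return a.weight * b.weight * cosine(a.vector, b.vector)
    # internal nodes: decayed sum over all pairs of children (simplified matching)
    return decay * sum(node_sim(ca, cb, decay)
                       for ca in a.children for cb in b.children)

def tree_kernel(t1: Node, t2: Node) -> float:
    """Sum node_sim over all node pairs of the two trees (brute force)."""
    def nodes(t):
        yield t
        for c in t.children:
            yield from nodes(c)
    return sum(node_sim(a, b) for a in nodes(t1) for b in nodes(t2))

# Toy example: "a dog runs" vs. "a cat runs" with hand-made 2-d embeddings.
dog = Node("NN", vector=[0.9, 0.1], weight=0.8)
cat = Node("NN", vector=[0.8, 0.2], weight=0.8)
runs1 = Node("VBZ", vector=[0.1, 0.9], weight=0.6)
runs2 = Node("VBZ", vector=[0.1, 0.9], weight=0.6)
a1 = Node("DT", vector=[0.5, 0.5], weight=0.2)
a2 = Node("DT", vector=[0.5, 0.5], weight=0.2)

s1 = Node("S", [Node("NP", [a1, dog]), Node("VP", [runs1])])
s2 = Node("S", [Node("NP", [a2, cat]), Node("VP", [runs2])])

print(round(tree_kernel(s1, s2), 4))

In practice, the parse trees, word embeddings, and attention weights would come from whichever parser, pretrained embedding model, and attention weight scheme the user plugs in; the simplified recursion above only illustrates how syntactic structure, word-level semantics, and attention weights can be combined into a single tree-kernel-style similarity score.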
Keywords/Search Tags:Natural language processing, Sentence similarity, Tree kernel, Sentence embedding, Attention weight mechanism