Sentence representation aims to obtain a fixed-dimension vector that represents a sentence expressed in natural language. Sentence representations are widely used in natural language processing tasks such as sentiment analysis, machine translation, question answering, and text classification, so their quality directly affects the performance of downstream tasks. However, most traditional methods rely on word-frequency statistics or simply average trained word vectors to represent a sentence. Such methods cannot evaluate the similarity of two sentences from a semantic perspective, and will wrongly judge two sentences as similar merely because they share many words, which limits their practical application.

As the ability to model text data has improved, more and more methods have become available for improving the quality of sentence representation. On the one hand, large-scale pre-trained neural networks, represented by the Transformer, model text data powerfully through the multi-head attention mechanism, making it possible to extract high-dimensional features from text. A pre-trained language model can achieve excellent transfer performance on a downstream task by fine-tuning on only a small amount of target data, which provides a key technology for improving sentence representation with large-scale pre-trained models. On the other hand, representation learning based on contrastive learning has achieved great success in image representation tasks. Simply by constructing the positive samples, negative samples, and contrastive loss required for contrastive learning, a model can learn the differences between positive and negative samples and thereby acquire an excellent ability to represent data. Although the contrastive learning framework is simple, it uses a Siamese network and the
loss function to narrow the distance between positive samples and enlarge the distance between positive and negative samples, allowing unsupervised representation learning to match earlier supervised training methods; this provides technical support for representing the large amount of unlabeled natural-language text. Therefore, this dissertation investigates sentence representation based on contrastive learning and deep neural networks. Specifically, it covers grouped self-supervised contrastive learning for sentence representation, contrastive learning of sentence representation with prompts, and multi-task contrastive learning of sentence representation based on a generative model. The main contributions of this dissertation are summarized as follows:

(1) Existing sentence representation methods use discrete data augmentation strategies to obtain the positive samples needed for contrastive learning, so the augmented sentence may lose its original semantics. To solve this problem, this dissertation proposes grouped self-supervised contrastive learning for sentence representation. First, the method designs a continuous and partial data augmentation to obtain positive samples, so that the augmented samples largely preserve the original semantics, which facilitates contrastive training. Second, to verify the effectiveness of the proposed data augmentation, a lightweight deep neural network is deliberately chosen as the encoder for extracting sentence semantic features, because the inherently powerful representation ability of large-scale language models would obscure the contribution of contrastive learning itself. Finally, to address the problem that a lightweight model may not effectively exploit the local information of high-dimensional vectors when computing the contrastive
loss, this dissertation divides the obtained sentence features into groups and computes the contrastive loss per group, so that more local information of the sentence features is used to improve the performance of sentence representation.

(2) Most existing contrastive learning methods directly reuse the contrastive framework of image representation for training and ignore the fact that the differences between text and image data limit contrastive training. This dissertation therefore proposes contrastive learning of sentence representation with prompts. Unlike previous studies, this method accounts for the fact that text data is sparse and discrete compared with images. A novel contrastive framework is designed in which a prompt mechanism guides the unsupervised model to understand the sentence representation task and thus generate high-quality sentence representation vectors. In this framework, a pre-trained language model serves as the backbone network, and continuous and partial data augmentation is applied in its word embedding layer to obtain the positive samples for contrastive learning. Experimental results show that the proposed method achieved the best performance among contrastive sentence representation methods at the time. In addition, to examine how supervised contrastive learning performs with deep neural network models on sentence representation, this dissertation also proposes a supervised contrastive sentence representation method and obtains encouraging results.

(3) It is observed that existing sentence representation methods use a Transformer encoder as the backbone network to extract sentence features, but the encoder represents high-frequency and low-frequency words with different distributions, which makes it difficult for the model to effectively extract sentence semantic
information. This dissertation proposes multi-task contrastive learning of sentence representation based on a generative model. Instead of obtaining the sentence representation vector directly from a Transformer encoder, this method generates the sentence representation with the decoder of a generative model. To improve the consistency of semantic information between words and sentences, a multi-task learning scheme is designed so that the model captures more inter-sample semantics, and a generation task is used to strengthen the semantic consistency between words, thereby improving the performance of sentence representation. Note that the proposed method employs a deep generative neural network to obtain sentence representations, and that the positive samples used for contrastive training come from natural language inference datasets. Experimental results show that generative sentence representation based on multi-task contrastive learning avoids the distribution gap between high- and low-frequency word representations in the pre-trained encoder and enhances the semantic consistency between words through multi-task learning, improving the quality of sentence representation.
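As a minimal illustration of the grouped contrastive loss described in contribution (1) — not the dissertation's actual implementation — the idea can be sketched with a standard InfoNCE objective computed per feature group and averaged. The function names, group count, temperature, and the use of in-batch negatives are all assumptions made for this sketch:

```python
import numpy as np

def info_nce(z1, z2, temperature=0.05):
    """InfoNCE loss over a batch of positive pairs (z1[i], z2[i]).

    All other in-batch samples act as negatives, as is common in
    contrastive sentence-representation training.
    """
    # L2-normalise so dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (N, N) similarity matrix
    # Row-wise log-softmax; the diagonal entries are the true pairs.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def grouped_info_nce(z1, z2, n_groups=4, temperature=0.05):
    """Split each feature vector into n_groups contiguous chunks and
    average the InfoNCE loss over the chunks, so that local sub-vector
    information also contributes to the training signal."""
    losses = [info_nce(g1, g2, temperature)
              for g1, g2 in zip(np.array_split(z1, n_groups, axis=1),
                                np.array_split(z2, n_groups, axis=1))]
    return float(np.mean(losses))

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 64))               # anchor sentence features
z2 = z1 + 0.01 * rng.normal(size=(8, 64))   # lightly augmented positives
loss = grouped_info_nce(z1, z2)
print(loss)
```

With near-identical positives the loss is close to zero; with `n_groups=1` the grouped loss reduces to the ordinary InfoNCE loss over the full feature vector.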