Deep learning has been applied in many fields, and sentence-level representations are used in many downstream natural language processing tasks. Although methods based on pre-trained language models such as BERT perform well on a variety of downstream tasks, the shortcomings of pre-trained language models lead to poor performance on English semantic textual similarity (STS) tasks. Many previous studies have trained good sentence embeddings through supervised learning, but labeled data are often difficult to collect, so training models with unsupervised methods is a significant trend. At present, unsupervised contrastive learning usually relies on existing methods to construct positive examples and to enlarge the batch size. Taking this as a starting point, this paper studies the problems of positive example construction and batch size expansion in existing unsupervised contrastive learning for natural language processing. The contributions of this paper are as follows: (1) Based on unsupervised contrastive learning, we propose a simple, easy-to-implement and effective model, SRL-BERT (BERT-based Sentence Representation Learning). SRL-BERT introduces a new text data augmentation method to construct the positive training pairs required for contrastive learning. (2) Previous studies have found that a larger batch size has a positive effect on contrastive learning, but due to hardware limitations, increasing the batch size in engineering applications may cause out-of-memory problems. Based on the vector fusion method Mixup, we propose a solution that effectively expands the training data for contrastive learning: with low additional memory consumption, the amount of training data can be directly doubled, thereby effectively improving the training performance of SRL-BERT. SRL-BERT is compared with many baseline models on Chinese and English datasets, and the experimental results show that it achieves good performance. In addition, we explore the influence of individual modules and parameters in SRL-BERT through a series of ablation and comparison experiments.
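The exact Mixup formulation used in contribution (2) is not spelled out here; the following is a minimal sketch, assuming standard Mixup interpolation between in-batch sentence embeddings, with the same mixing coefficient applied to both views of a positive pair so the mixed pair remains a positive pair, combined with a standard in-batch InfoNCE objective. The function names and parameters are illustrative, not the actual SRL-BERT implementation.

```python
# Sketch: Mixup-style expansion of contrastive training data (assumed formulation).
import torch
import torch.nn.functional as F


def mixup_expand(z1: torch.Tensor, z2: torch.Tensor, alpha: float = 0.2):
    """z1, z2: (batch, dim) embeddings of the two views of each positive pair."""
    batch = z1.size(0)
    lam = torch.distributions.Beta(alpha, alpha).sample((batch, 1)).to(z1.device)
    perm = torch.randperm(batch, device=z1.device)
    # Interpolate each example with a randomly chosen partner; using the same
    # coefficient for both views keeps the mixed pair a valid positive pair.
    z1_mix = lam * z1 + (1 - lam) * z1[perm]
    z2_mix = lam * z2 + (1 - lam) * z2[perm]
    # Concatenating originals and mixed examples doubles the effective training
    # data while only the embedding tensors occupy additional memory.
    return torch.cat([z1, z1_mix], dim=0), torch.cat([z2, z2_mix], dim=0)


def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.05):
    """Standard in-batch InfoNCE loss over the (expanded) positive pairs."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```

Because the mixing happens in embedding space after encoding, the encoder forward pass is run only on the original batch, which is why the effective batch can be doubled without a corresponding increase in GPU memory.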