Research On Text Matching Methods Based On Multiple Granularity And Siamese Interaction

Posted on: 2024-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Xi
Full Text: PDF
GTID: 2568307157983309
Subject: Software engineering
Abstract/Summary:
The main goals of Chinese text matching are to mine the deep semantic information within a text, to explore the semantic similarities and differences between texts, and then to assess the semantic similarity between the two texts to be matched. Existing work still suffers from insufficient extraction of latent semantics within a text, weak information interaction between texts, and limited ability to capture deep associations between sentences, all of which reduce the accuracy of semantic text matching. To address these problems, we study text matching methods based on multiple granularity and siamese interaction. The main research contents are as follows:

(1) A multi-perspective text matching model based on multi-granularity feature convolution is proposed to address the problems of single-granularity text features, insufficient capture of multi-granularity latent semantic information, and weak interaction matching of encoded features. First, we extract features at three granularities (characters, words, and associated phrases) through multi-mode segmentation and perform initial encoding. A bidirectional gated recurrent unit then extracts the contextual semantic information from these encodings. Second, we construct a multi-granularity high-dimensional encoding matrix and use convolutional neural networks to capture granularity features, which improves the representation of multi-granularity semantic information. Finally, the multi-granularity convolution matrices of the two texts are cross-matched by cosine similarity using a multi-perspective matching pattern to enhance the interaction of multi-granularity feature information. In experiments on the oral-expression dataset LCQMC and the financial dataset BQ, the model reached accuracy (Acc) values of 87.06% and 85.04%, respectively, an improvement of 1.23% and 0.85% over the best-performing comparison model. These results outperform the currently available non-BERT text matching models.

(2) A text matching model based on siamese interaction and fine-tuned representation is proposed to address the problems of insufficient encoding ability of sentence-pair vectors, lack of semantic information interaction between text sequences, and neglect of the role of labels in semantic representation. First, we build a siamese structure that embeds a soft-align attention (SaAttention) mechanism and a BiLSTM (Bidirectional Long Short-Term Memory) network. It achieves semantic alignment and information interaction through the SaAttention mechanism and uses the BiLSTM to extract contextual semantic relationships within the text. Second, the two texts to be matched are joined into a single sentence-pair input, which is fed into the RoBERTa model for initial encoding. An LSTM-BiLSTM structure then enables deep interaction and fusion between the sentence pair within the sequence. Finally, label supervision is used to fine-tune RoBERTa's encoding vectors to generate more representative feature vectors. In experiments on the LCQMC and BQ datasets, the model reached Acc values of 89.62% and 85.80%, respectively, an improvement of 1.24% and 0.20% over the best-performing comparison model. These results outperform the currently available BERT-type text matching models.
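
The soft-align attention step in (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so the version below is an assumption: it uses the common dot-product form of soft alignment, in which each token vector of one sentence is replaced by an attention-weighted sum of the other sentence's token vectors. The function names `softmax`, `dot`, and `soft_align` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def soft_align(a, b):
    """Assumed dot-product soft alignment between two token sequences.

    a: list of len_a token vectors, b: list of len_b token vectors,
    all of the same dimension. Returns (a_aligned, b_aligned), where
    a_aligned[i] is a weighted sum of the vectors in b and
    b_aligned[j] is a weighted sum of the vectors in a.
    """
    dim = len(a[0])
    # Pairwise similarity scores, shape len_a x len_b.
    scores = [[dot(ai, bj) for bj in b] for ai in a]

    # Each token of a attends over all tokens of b (row-wise softmax).
    a_aligned = []
    for row in scores:
        w = softmax(row)
        a_aligned.append(
            [sum(w[j] * b[j][k] for j in range(len(b))) for k in range(dim)]
        )

    # Each token of b attends over all tokens of a (column-wise softmax).
    b_aligned = []
    for j in range(len(b)):
        w = softmax([scores[i][j] for i in range(len(a))])
        b_aligned.append(
            [sum(w[i] * a[i][k] for i in range(len(a))) for k in range(dim)]
        )
    return a_aligned, b_aligned


# Toy input: sentence a has two tokens, sentence b has one.
a_tokens = [[1.0, 0.0], [0.0, 1.0]]
b_tokens = [[1.0, 0.0]]
a_aligned, b_aligned = soft_align(a_tokens, b_tokens)
```

In a siamese model of the kind described, the aligned vectors would typically be combined with the original encodings before a contextual layer such as a BiLSTM; that combination step is not specified in the abstract and is left out here.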
Keywords/Search Tags: Text Matching, Multi-granularity Feature Convolution, Multi-perspective Matching, Siamese Interaction, Fine-tuning Mechanism