
Research On Semantic Similarity Calculation Of Short Text Based On Neural Network

Posted on: 2021-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Yang
Full Text: PDF
GTID: 2428330623468533
Subject: Engineering
Abstract/Summary:
With the advent of the information era, text data has grown explosively, and extracting useful information from large amounts of text has become an important research task. As a key technology in natural language processing, short text semantic similarity calculation is widely used in text information retrieval and intelligent question answering systems. Short text semantic similarity calculation takes two short text sequences as input and computes the semantic similarity between them. At present, there are two main approaches. The first is statistics-based, such as the Vector Space Model. The second is based on neural networks, such as the DSSM model or the pre-trained BERT model. The BERT model trains a language model on large-scale corpora, learns general word representations, and has achieved strong results on a variety of natural language processing tasks.

This thesis studies the BERT model in depth. It argues that although the BERT model is powerful, it still has shortcomings for short text semantic similarity; for example, the context information at positions other than [CLS] is ignored. After surveying recent short text semantic similarity algorithms, this thesis proposes a novel algorithm that aggregates all of the context information produced by the BERT model.

The thesis first proposes a BERT fine-tuning model based on multiple attention mechanisms and an LSTM aggregation network. The model encodes the text sequence with BERT and uses multiple attention functions to compute interaction information; the text similarity vector aggregated by the attention mechanisms and the LSTM network is then fused with the encoding at the [CLS] position. Because this method uses different attention mechanisms to extract relevant information from the other positions, its results on three standard datasets exceed those of the BERT model, and among the improved BERT fine-tuning models it achieves the best results reported to date. In addition, ablation experiments that remove each attention function in turn analyze the impact of the different attention functions on the final predictions. The results show that every attention function contributes significantly to the model, but different attention functions affect the results to different degrees.

In addition, a BERT fine-tuning model based on the MatchPyramid structure is proposed. This model improves on the traditional method of extending the BERT model with a convolutional neural network. As before, the text sequence pair is joined into a single sequence and encoded by the BERT model. Instead of convolving the encoded sequence directly with a one-dimensional convolutional network, this model matches the encoded word vectors between the two sentences and then uses two-dimensional convolution and pooling to extract the matching information. Finally, the extracted matching information and the encoding at the [CLS] position are fused to compute the text semantic similarity. Because the convolutional network operates on the feature matching matrix, the word-level matching information between the sentences is taken into account, so the experimental results on paraphrase identification and natural language inference tasks are better than those of the traditional method of improving BERT with a convolutional neural network.
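The abstract does not give implementation details for the first model. The following is a minimal PyTorch sketch of the general idea, assuming the Hugging Face transformers BertModel; the class name, the specific attention functions (dot-product and bilinear against [CLS]), and all dimensions are illustrative assumptions, not the thesis's actual architecture.

```python
# Sketch of a BERT fine-tuning head that aggregates per-token context with
# several attention functions and an LSTM, then fuses the result with the
# [CLS] vector.  Class and variable names are illustrative, not from the thesis.
import torch
import torch.nn as nn
from transformers import BertModel

class MultiAttentionLSTMHead(nn.Module):
    def __init__(self, hidden=768, num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # Two example attention functions: dot-product and bilinear (assumed).
        self.bilinear = nn.Linear(hidden, hidden, bias=False)
        self.lstm = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        h = out.last_hidden_state                 # (B, L, H): all positions
        cls = h[:, 0]                             # (B, H): [CLS] encoding

        # Attention 1: dot-product attention of every token against [CLS].
        a1 = torch.softmax(h @ cls.unsqueeze(-1), dim=1) * h
        # Attention 2: bilinear attention with a learned weight matrix.
        a2 = torch.softmax(h @ self.bilinear(cls).unsqueeze(-1), dim=1) * h

        # Aggregate the attention-weighted context with an LSTM; keep the
        # final hidden state as the interaction summary.
        _, (h_n, _) = self.lstm(torch.cat([a1, a2], dim=-1))
        summary = h_n[-1]                         # (B, H)

        # Fuse the aggregated context with the [CLS] vector for prediction.
        return self.classifier(torch.cat([cls, summary], dim=-1))
```

Likewise, a rough sketch of the MatchPyramid-style variant is given below: the BERT token vectors of the two sentences are matched into a 2D similarity map, features are extracted with two-dimensional convolution and pooling, and the result is fused with the [CLS] encoding. The fixed segment split via len_a, the single convolution layer, and the pooling size are assumptions made only for illustration.

```python
# Sketch of a MatchPyramid-style BERT fine-tuning head: 2D convolution over a
# word-by-word matching matrix between the two sentences.  Hyperparameters and
# the calling convention (len_a) are assumptions, not the thesis's design.
import torch
import torch.nn as nn
from transformers import BertModel

class MatchPyramidHead(nn.Module):
    def __init__(self, hidden=768, num_labels=2, pooled=8):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool2d(pooled)
        self.classifier = nn.Linear(hidden + 16 * pooled * pooled, num_labels)

    def forward(self, input_ids, attention_mask, token_type_ids, len_a):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        h = out.last_hidden_state
        cls = h[:, 0]
        # Split the joined sequence back into the two sentences; len_a is the
        # length of the first segment (an assumed calling convention).
        tok_a, tok_b = h[:, 1:len_a], h[:, len_a + 1:]
        # Word-by-word matching matrix between the two sentences.
        match = torch.einsum("bih,bjh->bij", tok_a, tok_b).unsqueeze(1)
        # 2D convolution and pooling extract matching features from the map.
        feat = self.pool(torch.relu(self.conv(match))).flatten(1)
        # Fuse matching features with the [CLS] encoding for prediction.
        return self.classifier(torch.cat([cls, feat], dim=-1))
```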
Keywords/Search Tags:neural network, attention mechanism, BERT model, text semantic similarity