
Research On Short Text Similarity Based On Deep Learning

Posted on: 2021-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: X T Jiang
Full Text: PDF
GTID: 2428330647459595
Subject: Applied statistics
Abstract/Summary:
Short text similarity is a branch of text classification research. It plays a fundamental role in tasks such as intelligent question answering and information retrieval, and therefore has clear research value. Traditional text similarity calculation methods suffer from dimensional explosion and cannot capture semantics, so they no longer meet current needs. This thesis studies short text similarity with deep learning techniques, specifically selecting the Bi-LSTM model and the BERT model, and improves each according to its shortcomings. The main work of this thesis is as follows:

(1) When building a Bi-LSTM model based on word vectors, three shortcomings were found in short text similarity calculation. First, word vectors are affected by word segmentation ambiguity and out-of-vocabulary words, which easily leads to semantic errors; to address this, the thesis replaces word vectors with a combined character-and-word representation to strengthen the model's semantics. Second, the LSTM gate mechanism cannot highlight the key information of a sentence, so the model's feature extraction ability needs improvement; to address this, the thesis introduces a self-attention mechanism to improve feature learning. Third, the loss function tends to make the model overconfident in its predictions, so its generalization ability needs improvement; to address this, the thesis introduces label smoothing regularization to improve the loss function.

(2) When building a BERT model based on character vectors, two shortcomings were found in short text similarity calculation. First, the semantic expression of character vectors is not accurate enough, and the model is prone to semantic loss; to address this, the thesis proposes joint Chinese-English input to compensate for the semantic deficiency of character vectors, and additionally introduces a multi-head attention mechanism to further learn the bilingual features. Second, the loss function does not measure prediction quality well, making the model hard to tune; to address this, label smoothing regularization is again introduced to improve the loss function.

The experimental results show that the Macro-F1 of the improved Bi-LSTM model is 2.63% higher than that of the original model, and the Macro-F1 of the improved BERT model is 3.2% higher than that of its original model, so the improvement measures proposed in this thesis are effective. The results also show that the BERT model predicts better than the Bi-LSTM model: its Macro-F1 is 7.71% higher than that of the Bi-LSTM model.
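The self-attention idea used to improve the Bi-LSTM can be illustrated with a minimal numpy sketch. This is not the thesis's exact architecture: the matrix `H` stands in for the Bi-LSTM's per-token hidden states, and using `H` itself as query, key, and value (scaled dot-product attention) is an illustrative assumption; the point is only how attention re-weights hidden states so salient tokens dominate the sentence representation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(H):
    """Scaled dot-product self-attention over hidden states H of shape
    (T, d): every position attends to all T positions, so key tokens
    receive higher weight in the resulting representation."""
    d = H.shape[-1]
    scores = H @ H.T / np.sqrt(d)   # (T, T) pairwise similarities
    A = softmax(scores, axis=-1)    # attention weights; each row sums to 1
    return A @ H                    # (T, d) attended hidden states

# toy example: 4 tokens, hidden size 6
H = np.random.default_rng(0).normal(size=(4, 6))
out = self_attention(H)
```

A multi-head variant, as used with the BERT model, would simply run several such attentions on learned linear projections of `H` and concatenate the results.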
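Label smoothing regularization, applied to both models above, can be sketched as follows. This assumes the underlying loss is cross-entropy (the abstract does not name it) and uses the standard formulation: the one-hot target is mixed with a uniform distribution, so the model is penalized for fully confident predictions. The function name and `eps` parameter are illustrative, not from the thesis.

```python
import numpy as np

def label_smoothing_ce(logits, target, eps=0.1):
    """Cross-entropy with label smoothing: the one-hot target for class
    `target` is replaced by (1 - eps) on the true class plus eps spread
    uniformly over all classes."""
    logits = np.asarray(logits, dtype=float)
    n = logits.shape[-1]
    q = np.full(n, eps / n)          # uniform mass eps/n on every class
    q[target] += 1.0 - eps           # remaining mass on the true class
    log_p = logits - logits.max()    # log-softmax, numerically stable
    log_p = log_p - np.log(np.exp(log_p).sum())
    return -(q * log_p).sum()
```

With `eps=0` this reduces to plain cross-entropy; with `eps > 0` a confidently correct prediction still incurs a small loss, which discourages overconfidence and improves generalization.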
Keywords/Search Tags:short text similarity, Bi-LSTM model, BERT model, multi-head attention mechanism, label smoothing regularization