
Research And Application Of Chinese Short Text Matching Based On Pre-trained Model

Posted on: 2022-11-04    Degree: Master    Type: Thesis
Country: China    Candidate: J D Zhu    Full Text: PDF
GTID: 2518306752454094    Subject: Computer technology
Abstract/Summary:
Chinese short text matching is a prevalent task in natural language processing that centers on computing the semantic similarity of two text sequences. From an information-theoretic perspective, similarity is defined as the commonality between two text fragments: the greater the commonality, the higher the similarity, and vice versa. Text similarity is rapidly becoming a key tool in many Natural Language Processing (NLP) tasks, such as information retrieval, question answering, machine translation, dialogue systems, and document matching. Various semantic similarity metrics have been proposed over the past decades, and most scholars classify them by their basis: statistical data, corpora, or knowledge bases such as Wikipedia. However, existing methods are mostly designed for English, which leads to three limitations when they are applied to Chinese: (1) character-based models cannot capture the rich word-level semantics when the input tokens form a character sequence; (2) word-based models are more vulnerable to data sparsity and to out-of-vocabulary (OOV) words, and thus more prone to overfitting; (3) the very few approaches that consider both granularities are still limited. To tackle these problems, we propose a Graph Attention Leaping Connection Network that considers both semantic information and multi-granularity information, achieving sufficient information aggregation while alleviating over-smoothing. Extensive experiments are conducted, and a demonstration system of the algorithm is presented. In summary, the core contributions of this thesis are as follows:

1. This thesis presents an interactive graph neural network model that considers both the character level and the word level. The model first acquires the original character-level semantic representation from a pre-trained BERT model. Meanwhile, to preserve multi-granularity information, it constructs word lattices as subsequent inputs. On this basis, two major mechanisms, graph attention and the leaping (layer-hopping) connection, are introduced to obtain a better semantic representation of the text. Extensive experiments on three specialized datasets show that our model achieves the best performance among the compared short text matching models.
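To make these two mechanisms concrete, the following is a minimal sketch in PyTorch, not the thesis implementation: the class names, dimensions, and the concatenation-based leaping aggregation are illustrative assumptions, and the word-lattice adjacency matrix is assumed to include self-loops.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """One attention layer over word-lattice nodes (hypothetical names/shapes)."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (n, dim) node features, e.g. BERT character embeddings pooled per
        # lattice node; adj: (n, n) lattice adjacency with self-loops included.
        z = self.proj(h)
        n = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))  # attend only along lattice edges
        return F.softmax(e, dim=-1) @ z

class LeapingGAT(nn.Module):
    """Stacked attention layers whose intermediate states all feed the output
    (a jumping-knowledge-style leaping connection), easing over-smoothing."""
    def __init__(self, dim: int, n_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList(GraphAttentionLayer(dim) for _ in range(n_layers))
        self.mix = nn.Linear(dim * (n_layers + 1), dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        states = [h]
        for layer in self.layers:
            h = F.elu(layer(h, adj))
            states.append(h)
        # Leaping connection: concatenate every layer's state, then mix.
        return self.mix(torch.cat(states, dim=-1))
```

Because every layer's state reaches the output directly, deeper stacking can aggregate more lattice context without each node's representation collapsing toward the average of its neighbors.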
2. A neural network with a twin-tower structure is designed on the basis of the first model. We first analyze and compare the advantages and disadvantages of the interactive and twin-tower designs, then adapt the architecture to the twin-tower setting, making some adjustments while keeping the word lattice, the attentional interaction module, and the leaping-connection module. An open-source dataset and a private dataset are selected for testing, the former covering open domains and the latter a specific domain. A series of experiments verifies the advantages of the model under the twin-tower structure, after which discussions and reflections are given and the remaining problems are analyzed.

3. This thesis verifies the impact of changed conditions on the model and presents a domain knowledge graph. The former covers how different modules and training conditions affect the experimental results, mainly exploring the role of the leaping-connection (LC) module in node aggregation, the relationship between the early-stopping value and text length, the effect of the path retention rate when constructing the word lattice, and whether the computed probability P2 of the tower BERT needs to be retained; these points are examined through experimental validation, with conjectures and reflections offered while proving the value of the model. The latter covers the construction of a domain knowledge graph, including data capture, processing, matching, and extraction, as well as the design, construction, and display of the graph. The research is thereby put into practice, combining different technical modules into applications in real scenarios and enhancing the value of the model.
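As a companion sketch for the twin-tower variant, the fragment below shows only the general dual-encoder pattern; the placeholder encoder, the mean pooling, and the cosine scoring are assumptions, since the thesis reuses its own word-lattice, attention, and leaping-connection modules inside each tower.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLatticeEncoder(nn.Module):
    """Placeholder for the real lattice encoder (e.g., the LeapingGAT sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        return torch.relu(adj @ self.lin(h))  # one graph-convolution-like step

class TwinTowerMatcher(nn.Module):
    """Both towers share one encoder; each sentence is encoded independently."""
    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder

    def encode(self, feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        return self.encoder(feats, adj).mean(dim=0)  # pool lattice nodes

    def forward(self, fa, adj_a, fb, adj_b) -> torch.Tensor:
        va, vb = self.encode(fa, adj_a), self.encode(fb, adj_b)
        return F.cosine_similarity(va, vb, dim=0)    # similarity in [-1, 1]

# Illustrative usage with random lattice features for two sentences.
matcher = TwinTowerMatcher(TinyLatticeEncoder(dim=64))
adj = ((torch.rand(6, 6) > 0.5).float() + torch.eye(6)).clamp(max=1.0)
score = matcher(torch.randn(6, 64), adj, torch.randn(6, 64), adj)
```

Because each tower encodes its sentence without seeing the other, representations can be pre-computed and indexed; this is the usual trade-off against the interactive model's richer cross-sentence attention.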
Keywords/Search Tags:Short text matching, Chinese word lattice, Graph attention mechanism, Leaping connection, Matching application