
Text Matching Based On Ensemble Learning And Deep Learning

Posted on: 2021-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
Full Text: PDF
GTID: 2428330620964183
Subject: Engineering
Abstract/Summary:
With the gradual improvement of China's informatization, people expect more intelligent and accurate services in artificial intelligence fields such as information retrieval and automatic question answering. To keep improving algorithmic performance and deliver more efficient services, many researchers have turned to natural language processing. Text matching is a core, fundamental problem in natural language processing; the field has moved from traditional statistics-based matching methods to recent deep text matching methods. This thesis studies several popular deep learning matching approaches, including text matching with a single semantic expression, multiple semantic expressions, and attention mechanisms. Building on currently widely used algorithms, we propose Multi-Channel MatchPyramid, the Recurrent Attention Matching Model, and Dynamic Parameter Stacking. The main work of this thesis is as follows.

First, we propose Multi-Channel MatchPyramid (MCMP), a multi-semantic-expression text matching model. To address the information loss that occurs during matching in most existing expression-based text matching models, MCMP incorporates multiple channels that capture the match score, word importance, contextual information, and positional information. Experimental results on two datasets show that MCMP outperforms other expression-based text matching models, demonstrating that multi-channel text matching is effective.

Second, we propose the Recurrent Attention Matching Model (RAMM). RAMM is composed of multiple matching modules with identical structure; each module uses an attention mechanism to match and re-encode the texts, yielding multi-level matching information, and this multi-level semantic matching information is fused to produce the final result. Experiments show that RAMM is significantly better than other attention-based models on both datasets, demonstrating that multi-level semantic matching information is effective.

Finally, we propose Dynamic Parameter Stacking (DPStacking). In the standard Stacking algorithm, the secondary model cannot learn from the original text features. DPStacking generates the parameters of the secondary model with a parameter generator whose input is the statistical features of the text. This design, on the one hand, allows the secondary model to learn the connection between the original text features and the true labels; on the other hand, the secondary model can learn the relationship between the text features and the performance of each primary model, dynamically generating the weight of each primary model according to the features of the text. The thesis experimentally compares ensemble algorithms including Bagging, Stacking, and DPStacking; experiments show that DPStacking significantly outperforms the other ensemble learning algorithms on both datasets.
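To make the multi-channel idea concrete, the following is a minimal sketch of building per-channel interaction matrices for a text pair. The three channels here (embedding cosine similarity, exact word match, and relative-position agreement) are illustrative assumptions standing in for the thesis's match-score, word-importance, contextual, and positional channels, not its exact design.

```python
import numpy as np

def interaction_channels(a_emb, b_emb, a_ids, b_ids):
    """Stack multi-channel interaction matrices for two tokenized texts.

    a_emb, b_emb: (len_a, d) and (len_b, d) word-embedding matrices.
    a_ids, b_ids: integer word ids of the two texts.
    Returns an array of shape (3, len_a, len_b), one matrix per channel.
    """
    la, lb = len(a_ids), len(b_ids)
    # Channel 0: cosine similarity of word embeddings (semantic match).
    an = a_emb / np.linalg.norm(a_emb, axis=1, keepdims=True)
    bn = b_emb / np.linalg.norm(b_emb, axis=1, keepdims=True)
    cos = an @ bn.T
    # Channel 1: exact-match indicator (word identity).
    exact = (np.asarray(a_ids)[:, None] == np.asarray(b_ids)[None, :]).astype(float)
    # Channel 2: agreement of relative positions (location information).
    pa = np.arange(la) / max(la - 1, 1)
    pb = np.arange(lb) / max(lb - 1, 1)
    pos = 1.0 - np.abs(pa[:, None] - pb[None, :])
    return np.stack([cos, exact, pos])
```

In a full MCMP-style model, this stacked tensor would then be fed to convolutional layers, as in the original MatchPyramid, with the extra channels preserving information that a single similarity matrix loses.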
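The recurrent-attention idea of RAMM can be sketched as a stack of identical cross-attention modules, each re-encoding one text against the other and contributing one level of matching information. The module internals below (dot-product attention with a residual update, mean pooling per level, averaging as the fusion step) are simplifying assumptions for illustration, not the thesis's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_match(q, k):
    """One matching module: cross-attention from q over k, then a
    residual update so the next module matches at a deeper level."""
    scores = softmax(q @ k.T)          # (len_q, len_k) alignment weights
    aligned = scores @ k               # attended summary of k per q token
    return q + aligned                 # residual re-encoding

def recurrent_match(q, k, n_modules=3):
    """Stack identical modules; pool each level and fuse the levels."""
    levels = []
    for _ in range(n_modules):
        q = attention_match(q, k)
        levels.append(q.mean(axis=0))  # pooled matching vector per level
    return np.mean(levels, axis=0)     # fused multi-level representation
```

The key point is that every module's pooled output is kept and fused, so shallow and deep matching signals both reach the final decision, rather than only the last layer's.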
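The DPStacking mechanism can be sketched as a secondary model whose combination weights are not fixed but generated per example from statistical text features. The linear parameter generator below is a hypothetical stand-in; the thesis does not specify the generator's form here, only that its input is the text's statistical features and its output parameterizes the secondary model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class DynamicParameterStacking:
    """Secondary model whose weights over the primary models are
    produced by a parameter generator from text features."""

    def __init__(self, n_models, n_feats, rng=None):
        rng = rng or np.random.default_rng(0)
        # Generator parameters: text features -> one logit per primary model.
        self.W = rng.normal(scale=0.1, size=(n_feats, n_models))
        self.b = np.zeros(n_models)

    def predict(self, primary_preds, text_feats):
        """primary_preds: (n_models,) scores from the primary models;
        text_feats: (n_feats,) statistical features of the text pair."""
        weights = softmax(text_feats @ self.W + self.b)  # dynamic, per-example
        return float(weights @ primary_preds)            # weighted fusion
```

Because the weights depend on the input's features, the ensemble can, for example, trust an attention-based primary model more on long texts and a lexical model more on short ones; in standard Stacking the weights would be the same for every example.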
Keywords/Search Tags:deep learning, natural language processing, text match, attention mechanism, ensemble learning