
Research On Long Text Matching Based On Concept Interaction Graph

Posted on: 2022-12-14  Degree: Master  Type: Thesis
Country: China  Candidate: S Z Guo  Full Text: PDF
GTID: 2518306755960869  Subject: Control Engineering
Abstract/Summary:
With the rapid development of the Internet, social networks and self-media have become the main channels through which users share knowledge and publish stories, and a large amount of long text data has been generated online. Long text data contains richer semantic information and more complex logical information, which makes modeling and matching long text more difficult and challenging. Long text matching is a core task in Natural Language Processing, so research on long text matching methods has considerable significance and application prospects. This thesis studies the shortcomings of the Concept Interaction Graph (CIG) model in the long text matching task and designs improvements to address them. The main research contents and innovations are as follows:

(1) In the long text matching task, CIG fails to find keywords in some sentences, which lowers the accuracy of text content decomposition. To address this problem, a concept interaction graph model integrating word vector clustering (CCIG) is proposed. The model introduces word vector clustering, uses the clustering results to fill in missing keywords, and improves the accuracy of text decomposition by using multiple words as the basis for decomposition. Text matching experiments are carried out on the public Chinese long text data sets CNSE and CNSS. Compared with the CIG model, the F1 value increases by 0.88% and 1.01% respectively, which shows the effectiveness of adding word vector clustering for text decomposition.

(2) Building on the above model, the encoder of the CIG model is redesigned to address its insufficient representation of text sequence information. By introducing a recurrent neural network structure and keyword information, a gated recurrent unit encoder (Siamese-GRUkey) is designed, and a concept interaction graph model integrating keywords and the gated recurrent unit is proposed to further improve long text matching. Text matching experiments are carried out on the CNSE and CNSS data sets. The F1 values reach 84.51% and 91.20% respectively. Compared with the previous CCIG model, the F1 values increase by 1.64% and 0.84% respectively, which shows the effectiveness of the designed gated recurrent unit encoder.
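The keyword-filling idea in contribution (1) can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not name the clustering algorithm, the embeddings, or any function names, so the use of plain k-means, the toy 2-D vectors, and the helpers `kmeans`, `expand_keywords`, and `sentences_for_concept` are all assumptions made purely for illustration. The sketch shows how a sentence that lacks the original keyword can still be attached to a concept once the keyword set is filled with words from the same cluster.

```python
from math import dist  # available in Python 3.8+

# Toy 2-D "word vectors", invented for illustration only.
# A real system would use pretrained embeddings of much higher dimension.
TOY_VECTORS = {
    "bank": (1.0, 0.1), "loan": (0.9, 0.2), "interest": (0.8, 0.1),
    "goal": (0.1, 1.0), "match": (0.2, 0.9), "striker": (0.1, 0.8),
}

def kmeans(vectors, k, iterations=20):
    """Plain k-means over {word: vector}; returns {word: cluster_id}."""
    words = sorted(vectors)
    # Deterministic initialisation: the first k words (alphabetically) are the centroids.
    centroids = [list(vectors[w]) for w in words[:k]]
    assignment = {}
    for _ in range(iterations):
        # Assign every word to its nearest centroid (Euclidean distance).
        new_assignment = {
            w: min(range(k), key=lambda c: dist(vectors[w], centroids[c]))
            for w in words
        }
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[w] for w in words if assignment[w] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def expand_keywords(keywords, assignment):
    """Fill a keyword set with every word that shares a cluster with one of its members."""
    clusters = {assignment[k] for k in keywords if k in assignment}
    return set(keywords) | {w for w, c in assignment.items() if c in clusters}

def sentences_for_concept(sentences, keywords):
    """Return the sentences that contain at least one of the given keywords."""
    return [s for s in sentences if set(s.lower().split()) & set(keywords)]

assignment = kmeans(TOY_VECTORS, k=2)
sentences = ["The bank raised rates", "A loan needs collateral", "The striker scored"]

# With the single original keyword, the second sentence is not attached to the concept.
before = sentences_for_concept(sentences, {"bank"})
# After filling the keyword set from the cluster, it is.
after = sentences_for_concept(sentences, expand_keywords({"bank"}, assignment))
```

In this toy setup, `before` contains only the sentence that literally mentions "bank", while `after` also picks up the sentence about a loan, because "loan" falls in the same cluster. The number of clusters and the tokenisation by whitespace are simplifications; Chinese text such as CNSE and CNSS would need a word segmenter first.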
Keywords/Search Tags:Long Text Matching, Concept Interaction Graph, Word Embedding Clustering, Gated Recurrent Unit