Research And Application Of Text Similarity Calculation Method Based On Structured Representation Learning

Posted on: 2024-07-28

Degree: Master

Type: Thesis

Country: China

Candidate: X M Chong

Full Text: PDF

GTID: 2568307157974969

Subject: Software engineering

Abstract/Summary:

With the rapid development of the mobile internet, text information on the web is growing explosively, and how to mine useful information from massive amounts of text while filtering out duplicate content has become an urgent problem. Solving this problem usually involves text similarity calculation, and text semantic representation is an important factor affecting similarity calculation. Structured representation is a significant approach to text semantic representation. The structured representation generated by this approach can effectively illustrate the dependency relationships among the semantic blocks of a text and precisely convey its semantic content, semantic center, and theme. Furthermore, research indicates that the appropriate semantic representation structure varies with the task, and constructing a structured semantic representation tailored to a specific task is crucial for successful processing. However, external parsers typically create a generic semantic representation structure, and using them turns the model architecture into a pipeline in which errors propagate to later stages of processing, hindering global optimization and ultimately compromising the model's performance. This thesis therefore investigates structured representations of text and the corresponding similarity calculations. The main work of this thesis is as follows:

(1) This thesis proposes a text similarity calculation method based on the Gumbel-Tree-LSTM model. The method uses a pre-trained BERT model to obtain word embeddings, then uses Gumbel-Tree-LSTM to generate a structure tree and obtain a structured embedding of the text. The embedding is then passed to an MLP for similarity calculation. In contrast to using an external parser to construct a general parse tree, the proposed method generates a structure tree specific to the task of text similarity calculation. Experimental results demonstrate that this method outperforms classical similarity calculation methods.

(2) To address the excessive depth of structure trees and the difficulty of constructing complex structures for lengthy texts, this thesis proposes a text similarity calculation method based on a cascade model. Specifically, the method uses a stacked Gumbel-Tree-LSTM model to generate a structure tree, where the low-level part analyses dependencies between words within a clause and the high-level part analyses dependencies between clauses. Experimental results indicate that this method achieves higher accuracy and F1 value than the former method based on a single-layer model.

(3) To tackle the challenge of learning the dependency relationships and structures between clauses in the high-level part of the cascade model, and to prevent the model from taking "shortcuts" during training, this thesis presents a text similarity calculation method based on auxiliary task learning. The method uses an auxiliary task to constrain parameter learning and to help the high-level part of the model acquire semantic dependencies and structures. Experimental results indicate that this method can further enhance performance on text similarity calculation tasks.

(4) Finally, the thesis applies the text similarity calculation method to design and implement a Xi'an tourism knowledge Q&A system. The Q&A system matches relevant content accurately from a semantic perspective and returns it to the user. Overall, this system can help users obtain useful information about Xi'an tourism efficiently.
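The tree-induction idea behind contribution (1) can be illustrated with a small sketch. This is not the thesis's implementation; it is a toy under several assumptions: the word vectors are hand-written lists rather than BERT embeddings, the hypothetical `compose` function averages two children instead of applying a Tree-LSTM cell, candidate parents are scored by a dot product with a hypothetical `query` vector, only forward-pass selection is shown (no gradient estimator), and cosine similarity stands in for the MLP. What it does show is the general mechanism: repeatedly score every adjacent pair, perturb the scores with Gumbel noise, and merge the winning pair until one root vector remains.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compose(left, right):
    """Assumed stand-in for a Tree-LSTM cell: element-wise mean of children."""
    return [(a + b) / 2.0 for a, b in zip(left, right)]


def build_tree(vectors, query, rng, temperature=1.0):
    """Merge adjacent nodes until one root remains (input must be non-empty).

    Returns the root vector and a nested tuple of leaf indices that
    describes the induced binary tree.
    """
    nodes = [list(v) for v in vectors]
    trees = list(range(len(vectors)))
    while len(nodes) > 1:
        # One candidate parent per adjacent pair.
        candidates = [compose(nodes[i], nodes[i + 1])
                      for i in range(len(nodes) - 1)]
        # Gumbel-max trick: the argmax of noise-perturbed scores is a
        # sample from the softmax over the (temperature-scaled) scores.
        scores = [dot(c, query) / temperature + gumbel_noise(rng)
                  for c in candidates]
        k = max(range(len(scores)), key=scores.__getitem__)
        nodes[k:k + 2] = [candidates[k]]
        trees[k:k + 2] = [(trees[k], trees[k + 1])]
    return nodes[0], trees[0]


def cosine(a, b):
    """Cosine similarity, used here in place of the MLP scorer."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0
```

To compare two texts under these assumptions, one would call `build_tree` on each text's word vectors with the same `query` and pass the two root vectors to `cosine`. In a trainable model the scoring vector and the composition function would be learned jointly with the similarity objective, which is what makes the induced tree task-specific.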

Related items

1	Research Of Short Text Representation And Similarity Judgment In Deep Learning
2	Research And Application Of Similarity Calculation In Mixed Long And Short Texts
3	Study On Similarity-based Text Clustering Algorithm And Its Application
4	Research On Text Representation Model And Similarity Calculation Algorithm
5	Research On Personalized Recommendation Algorithms With Auxiliary Information
6	Research On Text Similarity Calculation Method And Its Application In Financial Field
7	Research On The Method Of Microblog Text Similarity Calculation Based On Weighted Word2vec
8	Research And Implementation Of College Enrollment Question And Answer Service System Based On Deep Learning
9	Research On Calculation Method Of Text Similarity Based On Deep Learning In Intelligent Question Answering System
10	Research On Short Text Similarity Calculation Method Based On Siamese Structural Model