Context-Aware Multi-layer Information Retrieval Method Based On BERT

Posted on:2022-01-15

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Luo

Full Text:PDF

GTID:2518306554482714

Subject:Computer technology

Abstract/Summary:

With the development of data production methods, a large amount of data is generated every moment by news recommendation platforms, e-commerce platforms, office automation systems, and similar sources. How to sift out the content that matches a user's information needs from this massive data has become a hot topic in information retrieval (IR). Pre-trained language models have recently been applied successfully to IR. Because the BERT model can be trained on a large-scale corpus to obtain universal embedding representations of words, it provides richer information than the traditional bag-of-words model and has become a basic building block in information retrieval tasks. Nevertheless, applying BERT to the query-document matching task has several limitations: 1) relevance assessments are made at the document level, but the number of tokens in a document often exceeds BERT's maximum input length; 2) applying BERT to long documents consumes a great deal of memory and run time, owing to the computational cost of the interactions between tokens.

This thesis explores a novel BERT-based multi-layer contextual passage architecture to overcome these limits. The main work covers two aspects. First, passage-level summary extraction: we use the Maximal Marginal Relevance (MMR) algorithm, based on TF-IDF, to extract important sentences as the passage-level summary, which ensures high relevance and reduces information redundancy. Second, BERT-based multi-layer contextual passage information retrieval: we take the passage-level summary extracted in the first stage as contextual evidence and combine it with the document title and the original text to compose the multi-layer contextual passage architecture. Finally, we use the sentence-pair classification task to predict the relevance score between query and passage.

Experiments were conducted on two standard ad-hoc retrieval collections with different characteristics, the TREC 2004 Robust Track (Robust04) and ClueWeb09. The results show that our method is generally better than the neural ranking baseline models and that, compared with other passage-level retrieval models, it achieves the best results on all metrics, which indicates that using the contextual information of a passage can significantly improve retrieval precision. The experimental results verify the effectiveness of the BERT-based multi-layer contextual passage retrieval method.
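The summary-extraction step described above can be illustrated with a minimal sketch of greedy Maximal Marginal Relevance over TF-IDF vectors. This is not the thesis's implementation: the tokenizer, the smoothed IDF weighting, the trade-off parameter `lam`, and the choice to measure relevance against a caller-supplied `reference` text (here the whole passage) are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: term -> weight) per text."""
    token_lists = [tokenize(t) for t in texts]
    n = len(token_lists)
    # Document frequency: in how many texts each term appears.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(sentences, reference, k=2, lam=0.7):
    """Greedily pick up to k sentence indices, in selection order.

    Each step maximizes lam * relevance - (1 - lam) * redundancy, where
    relevance is similarity to `reference` and redundancy is the highest
    similarity to any sentence already selected.
    """
    vectors = tfidf_vectors(sentences + [reference])
    sent_vecs, ref_vec = vectors[:-1], vectors[-1]
    relevance = [cosine(v, ref_vec) for v in sent_vecs]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Usage: the second sentence duplicates the first, so MMR should skip it.
sentences = [
    "BERT ranks documents for a query.",
    "BERT ranks documents for a query.",
    "Long texts exceed the input length limit.",
]
picked = mmr_select(sentences, reference=" ".join(sentences), k=2, lam=0.5)
summary = " ".join(sentences[i] for i in sorted(picked))
print(summary)
```

A lower `lam` penalizes redundancy more strongly, while `lam = 1.0` reduces the procedure to plain relevance ranking. The resulting summary would then serve as the contextual evidence attached to the title and passage text before sentence-pair classification.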

Keywords/Search Tags:

Relevance Matching, Text Summary Extraction, BERT, Neural Ranking Models

Related items

1	Research On BERT-based Neural Ranking Models
2	Research On Extraction Summary Generation Technology Based On Attention Mechanism
3	Research On Chinese Text Summary Method Based On Deep Learning
4	Research On Chinese Text Summarization Technology Based On BERT-KA-PGN Model
5	Research On Text Extraction Method Based On Key Sentence And Keyword Association
6	A Research On Abstract Summary Extraction Of Long Texts Based On BERT Model
7	Research On Ecommerce Document Retrieval Technology Based On BERT Model
8	A Transferable Approach To Generating Abstractive Text Summary Based On Pre-trained Language Model
9	Research On Chinese Text Summary Extraction Algorithm Based On TextRank
10	Research On Semantic Matching Method Of Chinese Text Based On BERT