
Retrieval Model Based On Deep Learning

Posted on: 2019-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2428330566498100
Subject: Computer Science and Technology

Abstract/Summary:
Deep learning is currently at a high point of its development and has achieved great success in image recognition, speech processing, and machine translation. Information retrieval, a research field closely related to natural language processing, has also been shaped by this wave of technical progress. The number of neural-network papers at SIGIR is growing rapidly, and neural networks have become the leading edge of information retrieval research.

Current deep retrieval models can be divided into two categories: models that focus on representation learning and models that focus on matching (interaction) learning. The matching-focused models have become the focus of recent research. They use a similarity matrix to describe the matching relationship between the query and the document, require relatively little training data, and perform well on long texts.

This thesis builds on DRMM, a representative interaction-learning model, and focuses on the deep retrieval model. It studies how to construct a similarity matrix, how to extract relevance information from the similarity matrix, and how to rank documents based on that relevance information.

The thesis compares the effect of the word embedding source on model performance, asking whether embeddings pre-trained on a large-scale general corpus or embeddings pre-trained on an in-domain corpus work better. It attempts to improve the existing similarity measure by nonlinearly transforming cosine similarity, and also tries to obtain a better measure by replacing cosine similarity with a quadratic form or an MLP.

The thesis compares the expansion terms predicted by traditional pseudo-relevance feedback with the expansion terms computed from word embeddings. It combines pseudo-relevance feedback with the existing model through a weighting scheme and studies the effect of query expansion on model performance.

The thesis then compares similarity modeling based on distribution statistics, similarity modeling based on convolutional neural networks, and similarity modeling based on passage modeling, and studies the effect of different kernel functions on the distribution-statistics model.
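To make the distribution-statistics idea concrete, the following is a minimal sketch: build a query-document cosine similarity matrix from pre-trained word embeddings, then summarize each query term's row of the matrix with a bank of RBF kernels. The function names, kernel count, and kernel width here are illustrative assumptions, not the thesis's exact configuration.

```python
# Sketch of interaction-focused matching: cosine similarity matrix
# plus RBF kernel pooling (distribution statistics). Hypothetical
# names and parameters for illustration only.
import torch
import torch.nn.functional as F

def similarity_matrix(query_emb, doc_emb):
    """Cosine similarity between every query term and every document term.

    query_emb: (q_len, dim), doc_emb: (d_len, dim) pre-trained embeddings.
    Returns a (q_len, d_len) interaction matrix.
    """
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    return q @ d.t()

def kernel_pooling(sim, mus, sigma=0.1):
    """Soft-count matching signals with RBF kernels.

    sim: (q_len, d_len); mus: kernel centres on [-1, 1].
    Returns (q_len, n_kernels): one soft histogram per query term,
    which a small MLP can then map to a relevance score.
    """
    diff = sim.unsqueeze(-1) - mus.view(1, 1, -1)  # (q_len, d_len, K)
    k = torch.exp(-0.5 * (diff / sigma) ** 2)      # kernel response
    return torch.log1p(k.sum(dim=1))               # pool over doc terms

# Example: 3-term query, 8-term document, 5 kernels spanning [-1, 1].
q_emb, d_emb = torch.randn(3, 50), torch.randn(8, 50)
feats = kernel_pooling(similarity_matrix(q_emb, d_emb),
                       mus=torch.linspace(-1.0, 1.0, 5))
print(feats.shape)  # torch.Size([3, 5])
```

Each kernel acts as a soft bin over similarity values, so the pooled vector describes the distribution of match strengths for a query term rather than only its maximum.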
Phrase-level matching information is integrated into the similarity matrix through a convolution operation, and the thesis compares the performance of the convolutional model when using a pooling layer alone and when using a convolutional layer together with a pooling layer.

The thesis also explores how to integrate traditional, effective passage-level information into existing models. It splits each document into fixed-length passages, captures the similarity signal within each passage, and then uses a recurrent neural network to integrate the passage-level signals into a document score.

The thesis describes the overfitting problem observed when training the existing model and attempts to address it through regularization and tuning of model parameters. It verifies the performance of several methods for fusing similarity signals in this model. It converts query-level features into document features through weighting so that they can be integrated into the document score, and also tries fully connected networks and LSTMs to integrate the feature representations of different query terms.

Finally, the thesis explores how to combine multiple similarity modeling techniques to achieve better performance. Specifically, it tries to replace the pooling layer of the convolutional neural network with the distribution-statistics method, and likewise to use distribution statistics instead of pooling in passage modeling.
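A hedged sketch of that last combination is given below: a convolution first injects phrase-level matching into the similarity matrix, and the usual max-pooling step is swapped for the same RBF kernel statistics. The layer sizes and kernel parameters are assumptions chosen for illustration, not the thesis's reported architecture.

```python
# Sketch: convolution over the interaction matrix with kernel
# statistics replacing max pooling. Hypothetical layer sizes.
import torch
import torch.nn as nn

class ConvKernelPooling(nn.Module):
    def __init__(self, n_filters=8, n_kernels=5, sigma=0.1):
        super().__init__()
        # A 3x3 convolution over the (query x document) similarity
        # matrix captures local phrase-level match patterns.
        self.conv = nn.Conv2d(1, n_filters, kernel_size=3, padding=1)
        self.register_buffer("mus", torch.linspace(-1.0, 1.0, n_kernels))
        self.sigma = sigma
        self.score = nn.Linear(n_filters * n_kernels, 1)

    def forward(self, sim):
        # sim: (batch, q_len, d_len) cosine similarity matrices.
        x = torch.tanh(self.conv(sim.unsqueeze(1)))   # (B, F, q, d)
        diff = x.unsqueeze(-1) - self.mus             # (B, F, q, d, K)
        k = torch.exp(-0.5 * (diff / self.sigma) ** 2)
        pooled = torch.log1p(k.sum(dim=3))            # stats over doc axis
        per_query = pooled.mean(dim=2)                # (B, F, K)
        return self.score(per_query.flatten(1))      # (B, 1) relevance

model = ConvKernelPooling()
print(model(torch.rand(2, 3, 8) * 2 - 1).shape)  # torch.Size([2, 1])
```

Compared with max pooling, the kernel statistics keep information about how often each match strength occurs along the document axis, rather than retaining only the single strongest signal per filter.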
Keywords/Search Tags: Deep Learning, Retrieval Model, Pseudo-relevance Feedback, Passage Retrieval