Exploiting Document Boltzmann Machine In Query Extension

Posted on:2018-03-13

Degree:Master

Type:Thesis

Country:China

Candidate:L M Huang

GTID:2348330542484888

Subject:Software engineering

Abstract/Summary:

Most work on query extension (QE) assumes that the terms in a document are independent, and many QE models use the multinomial distribution to model feedback documents. We argue that in QE methods, the relevance model (RM) that generates the feedback documents should be modeled with a more suitable distribution, so that term associations in the feedback documents are handled naturally. Recently, the Document Boltzmann Machine (DBM) was proposed for document modeling in information retrieval. This model relaxes the independence assumption, i.e., it can capture term dependencies naturally. It has been shown that DBM can be seen as a generalization of the traditional unigram language model and that it achieves better ad hoc retrieval performance. In this paper, we replace the multinomial distribution in the traditional unigram RM method with DBM, while leaving the main QE framework nearly unchanged to keep the model simple. The relevance model is thus estimated by a DBM trained on the feedback documents, called the relevance DBM (rDBM). The extended query is generated from the learned rDBM, and we derive the final extended query likelihood from the parameter values of the rDBM. One difficulty in learning the rDBM is data sparseness, which can lead to an overfitted rDBM and harm retrieval performance. To solve this problem, we adopt Confident Information First (CIF) as the model selection principle to reduce the complexity of the rDBM, which makes our proposed query extension method more efficient and practical. Experiments on several standard TREC collections show the effectiveness of our QE method with DBM and of the model selection method. In addition, we optimize the Document Boltzmann Machine using the Akaike information criterion (AIC). As a result, we reduce the complexity of the model, solve the data sparseness problem that can lead to overfitting, and improve retrieval performance on several standard TREC collections.
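
For orientation, the traditional unigram RM baseline that the abstract describes replacing can be sketched as below. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the uniform weighting of feedback documents, the `num_terms` cutoff, and the interpolation weight `alpha` are all hypothetical choices made for the example. It does not implement DBM, rDBM, CIF, or AIC.

```python
from collections import Counter


def unigram_relevance_model(feedback_docs):
    """Estimate P(w | R) under the term-independence assumption.

    Each feedback document contributes its maximum-likelihood unigram
    distribution; documents are weighted uniformly (an assumption made
    here for simplicity).
    """
    docs = [doc for doc in feedback_docs if doc]  # skip empty documents
    model = Counter()
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        for term, count in counts.items():
            model[term] += (count / total) / len(docs)
    return dict(model)


def extend_query(query_terms, feedback_docs, num_terms=3, alpha=0.5):
    """Interpolate the original query with the top feedback terms.

    alpha is the weight kept on the original query (hypothetical default).
    """
    rm = unigram_relevance_model(feedback_docs)
    # Highest probability first; ties broken alphabetically for determinism.
    top = sorted(rm.items(), key=lambda kv: (-kv[1], kv[0]))[:num_terms]
    norm = sum(prob for _, prob in top)

    extended = Counter()
    for term in query_terms:
        extended[term] += alpha / len(query_terms)
    if norm > 0:
        for term, prob in top:
            extended[term] += (1 - alpha) * prob / norm
    return dict(extended)


if __name__ == "__main__":
    docs = [
        ["boltzmann", "machine", "model"],
        ["boltzmann", "machine", "retrieval"],
    ]
    print(extend_query(["boltzmann"], docs, num_terms=2))
```

Because every term is scored on its own frequency, this baseline cannot express that "boltzmann" and "machine" tend to occur together. That limitation is what the abstract proposes to address by estimating the relevance model with a DBM.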

Keywords/Search Tags:

Document Boltzmann Machines, Query Extension, Model Selection, CIF, AIC

Related items

1	A Method Of Improving Restricted Boltzmann Machines Via Theta Pure Dependency
2	Parameter Choice For Boltzmann Machines:Theories And Applications
3	Research On Document Modeling And Query Expansion For Short Messages
4	Gaussian Distribution Restricted Boltzmann Machines And Its High Dimension Extension
5	Research Of Deep Learning Method Based On Restricted Boltzmann Machines
6	Research On Learning Algorithms For Restricted Boltzmann Machines
7	A Study Of Boltzmann Machines For Classification And Ranking Tasks
8	Study Of An XML Query Language XQuery And Its Extension
9	Study On Boltzmann Machines For Robust Target Recognition
10	Research Of Restricted Boltzmann Machines Based On Smooth L₀ Norm And Its Application