Study on Feedback in Information Retrieval

Posted on: 2008-06-18  Degree: Master  Type: Thesis
Country: China  Candidate: J X Deng  Full Text: PDF
GTID: 2178360215990240  Subject: Computer software and theory
Abstract/Summary:
In information retrieval, unsatisfactory results are often caused by imprecisely expressed information needs. To make a retrieval system more effective, the search strategy must be adjusted so that it converges on the user's actual need. Researchers have recently proposed various feedback methods built on different retrieval models, the most important of which are the vector space model and the probabilistic model. Based on a careful analysis of feedback techniques, this thesis makes the following contributions.
First, we analyze the vector space model and the feedback techniques built on it. We find that the precision of both retrieval and feedback depends on the accuracy of the original matrix data. Traditionally, the matrix entries are derived from tf-idf or one of its improved variants. These methods, however, overlook an important factor: in a web document, a word enclosed in one HTML tag may differ in importance from the same word enclosed in another HTML tag. Data produced by the traditional methods therefore cannot reflect the real characteristics of the documents. This thesis proposes an algorithm for calculating the influence factor of each tag. The algorithm is derived from an analysis of a large number of documents and yields a more accurate tag influence factor, which is then used to improve the original matrix. Experimental results show some improvement in precision.
The most important analysis concerns the probabilistic latent semantic model (PLSM). To begin with, we improve a method for choosing the initial k points by means of a radiation field, and we provide practical methods for obtaining the parameter values.
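The tag-weighting idea described above can be illustrated with a minimal sketch. It is not the algorithm from the thesis: the tag weights in `TAG_WEIGHTS` are placeholder values chosen for illustration (the thesis derives its factors from a document corpus), and the function names are hypothetical. The sketch only shows how a per-tag factor could replace the raw term count in a standard tf-idf computation.

```python
import math
from collections import Counter

# Hypothetical tag influence factors. These numbers are placeholders,
# not the values derived in the thesis.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "b": 1.5, "p": 1.0}


def weighted_tf(tagged_terms, tag_weights=TAG_WEIGHTS, default=1.0):
    """Accumulate the tag weight of each occurrence instead of counting 1."""
    tf = Counter()
    for tag, term in tagged_terms:
        tf[term] += tag_weights.get(tag, default)
    return tf


def tag_weighted_tfidf(docs):
    """docs: list of documents, each a list of (tag, term) pairs.

    Returns one {term: weight} dict per document, using idf = log(N / df).
    """
    tfs = [weighted_tf(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tf in tfs for term in tf)
    return [
        {term: w * math.log(n_docs / df[term]) for term, w in tf.items()}
        for tf in tfs
    ]


docs = [
    [("title", "retrieval"), ("p", "feedback"), ("p", "retrieval")],
    [("p", "feedback"), ("p", "model")],
]
weights = tag_weighted_tfidf(docs)
```

Here "retrieval" in the first document receives a term weight of 3.0 + 1.0 because one occurrence sits in a `title` tag, whereas plain tf-idf would count it as 2.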
Based on the probabilistic latent semantic model and combined with the improved clustering algorithm, a new feedback process is proposed. The input to the clustering is the original data of the probabilistic latent semantic model, and the result of the k-medoids algorithm (Partitioning Around Medoids, PAM) is used to simulate the latent semantic classes. This is justified because the clustering result depends only on the input data, and the algorithm that initializes the probabilistic latent semantic model likewise depends only on the original data; if the number of clusters equals the number of latent classes in the model, the clusters can be used to simulate the latent classes. We also propose a method for generating class titles and a method for query word expansion. Experiments show that both retrieval precision and recall improve, indicating that the feedback system based on PLSM performs better than before.
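As a rough illustration of the clustering step, the sketch below runs a simplified k-medoids swap search in the spirit of PAM and returns one cluster label per input point; those labels play the role of the simulated latent classes. Several things are assumptions rather than the thesis's method: the initial medoids default to the first k points (the radiation field initialization is not reproduced), the swap search accepts the first improving swap instead of the best one, and the function names and toy data are hypothetical.

```python
import math


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=math.dist, init=None):
    """Simplified PAM: swap a medoid with a non-medoid while cost drops.

    init: optional list of k starting medoid indices (assumed to come from
    some external initialization); defaults to the first k points.
    Returns (medoid indices, cluster label per point).
    """
    medoids = list(init) if init is not None else list(range(k))
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:  # accept the first improving swap
                    medoids, best, improved = trial, cost, True
    # Assign each point to the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Toy 1-D "documents"; real input would be the document vectors.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
medoids, labels = pam(points, k=2)
```

Setting k to the number of latent classes in the model gives one cluster per class, which is the condition the abstract states for using clusters in place of latent classes.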
Keywords/Search Tags: information retrieval, relevance feedback, tag influence weight, radiation field, class title generation, query word expansion