
An Information Retrieval System Based on Word Relations

Posted on: 2011-06-25  Degree: Master  Type: Thesis
Country: China  Candidate: L K Ding  Full Text: PDF
GTID: 2208360305497623  Subject: Communication and Information System
Abstract/Summary:
This thesis reviews popular information retrieval technologies, in particular the vector space model (VSM), its variants, and the inverted index. After examining the strengths and weaknesses of existing methods, it introduces a new ranking algorithm based on word relations, which reflect the statistical properties of a text collection. A word relation is defined from the number of documents that contain a given word and the number that contain a given word pair. The algorithm addresses the word dependency problem in VSM by adjusting the semantic vector according to these word relations. An information retrieval system over a medium-scale text collection is built using the word relation algorithm together with an inverted index. The system supports searching by keywords, by text, and for similar articles. Tests show good precision and satisfactory performance. Several design issues and the steps for building the system are discussed in detail.
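The abstract does not give the thesis's actual formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. It builds an inverted index, derives a word relation from document counts (here assumed to be the share of documents containing word `a` that also contain word `b`), and uses that relation to adjust a query vector before ranking. The function names, the relation formula, the `threshold` parameter, and the additive scoring are all hypothetical choices for illustration, not the author's method.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in set(text.lower().split()):
            index[word].add(doc_id)
    return index


def word_relation(index, a, b):
    """Assumed relation: fraction of documents containing `a` that also contain `b`."""
    docs_a = index.get(a, set())
    if not docs_a:
        return 0.0
    docs_b = index.get(b, set())
    return len(docs_a & docs_b) / len(docs_a)


def expand_query(index, query_words, threshold=0.5):
    """Adjust the query vector: query words get weight 1.0, and any other
    word whose relation to a query word reaches `threshold` is added with
    that relation as its weight (hypothetical adjustment rule)."""
    weights = {w: 1.0 for w in query_words}
    for q in query_words:
        for other in index:
            if other in query_words:
                continue
            r = word_relation(index, q, other)
            if r >= threshold:
                weights[other] = max(weights.get(other, 0.0), r)
    return weights


def rank(index, weights):
    """Score each document by summing the weights of the words it contains,
    then sort by descending score (ties broken by document id)."""
    scores = defaultdict(float)
    for word, weight in weights.items():
        for doc_id in index.get(word, ()):
            scores[doc_id] += weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = [
        "inverted index search",
        "vector space model search",
        "inverted index compression",
    ]
    index = build_inverted_index(docs)
    weights = expand_query(index, ["inverted"], threshold=0.9)
    print(weights)
    print(rank(index, weights))
```

In this toy collection, "inverted" and "index" always occur in the same documents, so a query for one also pulls in the other. A plain VSM would treat the two words as independent dimensions, which is the word dependency problem the abstract refers to.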
Keywords/Search Tags: Word relation, Information Retrieval, VSM, Inverted index