Font Size: a A A

Research Of Query Expansion Based On Formal Concept Analysis

Posted on:2008-06-13Degree:MasterType:Thesis
Country:ChinaCandidate:Y F HaiFull Text:PDF
GTID:2178360212495654Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Internet has become one of the main information sources in daily live. The primary tool of searching information is search engine. Today, the results of most search engine are based on exact matching between query terms and web pages.One of the most important tasks of search engine is presenting more additional relevant web pages and reducing those web pages which are useless for user, i.e. improving the search precision of search engine. Query expansion is an efficient method for this. To the large-scale information retrieval such as search engine, query expansion techniques in existence are inefficient because of the particularity of web pages' structure. So improvements of query expansion techniques in existence become a hotspot.This thesis used method of formal concept analysis and researched the query expansion of search engine based on the data structure named concept lattice. And then a new strategy for expanding query terms based on formal concept analysis was proposed. This strategy can improve the intelligence of search engines. Under this strategy, web pages in lower adjacence set of user's requirements were described with fonnalization and then formed page-term formal context. Based on this formal context a concept lattice was built which was the main data structure of our strategy of query expansion. After this, a method of mining minimal non-redundant association rules' between query terms and non-query terms was proposed based on minimal generators on concept lattice. About this method, the concept of minimal generator, generation of minimal generator and the relations between minimal generator and closed terms set were discussed in detail. And then this thesis gave the algorithms of mining minimal non-redundant association rules whose confidences equal to 100% and less 100% between closed terms set and their minimal generators based on concept lattice. This method is very efficient for search engine because a lot of redundant association rules are discarded during the mining. To other mining method, this method is more efficient.For testing the method in this thesis, an experiment coded in C++ was performed. In the experiment, five groups with different count of query terms and their lower adjacence web page set were adopted. The experiment indicated that the process times was acceptable for users when the count span of query terms between 3 and 7 and the count of terms which represented the web page they appeared were 100. We also ran an experiment on the Mushroom dataset. The results were compared with the results of F. A. Grootjen's expansion method and the results of Zaki's rules mining method. The experiments showed that this query expansion method was practical.
Keywords/Search Tags:Search Engine, Formal Concept Analysis, Association Rule, Query Expansion
PDF Full Text Request
Related items