
Query Expansion Technique Based On Association Rules

Posted on: 2013-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2248330395480676
Subject: Computer technology
Abstract/Summary:
As the volume of information on the network grows, only a small portion of what is on the web is truly relevant or useful to any one person, so it is difficult to find the information people need on the Internet. General-purpose search engines also show serious limitations, such as inaccurate results. To address these problems, this dissertation studies query expansion algorithms based on association rules, on the view that modifying the original query improves retrieval. The main content, contributions, and innovations of the dissertation are described below.

1. First, we introduce the basics of data mining and association rules. We then analyze the existing query expansion algorithms based on association rule mining and identify their advantages and disadvantages. Their common drawback is that they pay no attention to the efficiency of the association rule mining algorithm itself.

2. To overcome this shortcoming, we present a query expansion algorithm based on mining maximal frequent itemsets. The initial search uses vector space model retrieval, and word segmentation is applied to the several documents it returns. The resulting set is stored in vertical format, and support is obtained by intersection. An enumeration tree and several pruning strategies are used to mine the maximal frequent itemsets. Candidate expansion terms are taken from the association rules, and a second retrieval is run with the new query set. Experimental results show that the algorithm improves retrieval effectiveness.

3. Maximal frequent itemsets lose the support information of many frequent itemsets, and the query expansion algorithm based on maximal frequent itemset mining does not consider the weights of the original query terms and the expansion terms. To address these problems, we present a second query expansion algorithm, based on mining frequent closed itemsets. The algorithm uses the HT-struct hyperlink structure and several pruning strategies to mine the frequent closed itemsets depth-first, and it calculates term weights from the confidence of the association rules. Experimental results again show the effectiveness of the algorithm.
Keywords/Search Tags: search engine, query expansion technology, association rule mining, maximum frequent itemsets, frequent closed itemsets
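
The abstract describes two mechanics that a short example can make concrete: storing the segmented documents in vertical format so that support comes from set intersection, and weighting expansion terms by the confidence of an association rule. The Python sketch below illustrates only those two ideas, under stated assumptions. It does not mine maximal or closed frequent itemsets, it does not use an enumeration tree or the HT-struct structure, and the function names, thresholds, and sample documents are hypothetical rather than taken from the dissertation.

```python
def build_vertical_index(docs):
    """Vertical format: map each term to the set of ids of documents containing it."""
    index = {}
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            index.setdefault(term, set()).add(doc_id)
    return index


def support(index, itemset, n_docs):
    """Relative support of an itemset, computed by intersecting the terms' id sets."""
    tidsets = [index.get(term, set()) for term in itemset]
    if not tidsets or n_docs == 0:
        return 0.0
    return len(set.intersection(*tidsets)) / n_docs


def expansion_weights(docs, query_terms, min_support=0.5, min_confidence=0.6):
    """Weight each candidate term t by the confidence of the rule query -> t.

    Assumption: confidence = support(query + t) / support(query), and the
    thresholds are illustrative defaults, not values from the dissertation.
    """
    index = build_vertical_index(docs)
    n_docs = len(docs)
    query = frozenset(query_terms)
    query_support = support(index, query, n_docs)
    if query_support == 0:
        return {}
    weights = {}
    for term in index:
        if term in query:
            continue
        joint = support(index, query | {term}, n_docs)
        confidence = joint / query_support
        if joint >= min_support and confidence >= min_confidence:
            weights[term] = confidence
    return weights


# Hypothetical, already segmented documents from an initial search.
docs = [
    ["query", "expansion", "association", "rule"],
    ["query", "expansion", "retrieval"],
    ["query", "association", "rule"],
    ["search", "engine", "retrieval"],
]
weights = expansion_weights(docs, ["query"])
```

Here the terms that pass both thresholds become candidate expansion terms, and their confidence values could serve as weights in the second retrieval. Terms that co-occur with the query too rarely, such as "retrieval" in this sample, are dropped.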