Font Size: a A A

Research On XML Keyword Search Processing Method Based On SLCA Sematic

Posted on:2013-02-17Degree:MasterType:Thesis
Country:ChinaCandidate:G X LanFull Text:PDF
GTID:2248330392954864Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
As a kind of effective data query method, keyword search has been an importantissue of XML data management research. For a given keyword query Q, lowest commonancestoris the basis of existing XML keyword search semantics, of which the most widelyfollowed variant is smallest LCA. It is of great significance to research on XML keywordsearch methods based on SLCA sematic and improve the methods’ effiency and precison.Firstly, the problem of repeatedly processing common ancestor nodes is the mainreason for influencing the efficiency of SLCA computation. To facilitate it, we propose atop-down query processing strategy. Based on it, we present an efficient algorithmLPSLCA which recursively divides the set of inverted lists in a top-down manner.Secondly, LPSLCA has overcomed the problem of repeatedly processing commonancestor nodes. It still has redundancy operation when it skips the processed commonancestor nodes on another invert list. We propose a top-down SLCA computationalgorithm based on hash search, called TDHS. It finds the IDDewey lables which containthe processed common ancestor nodes’ ID and skips them using hash table. TDHS notonly has solved the LPSLCA’ problem, but also reduced the time complexity.Thirdly, we design a new inverted list CList based on column storage and give analgorithm TDCOL based on CList. It has solved the problems of redundancy computationand repeatedly checking hash table of the algorithms which are based on traditionalinverted lists. For further reducing the time complexity, we propose a new algorithm usinghash table. It is aslo based on CList and called TDCOLHS. It converts the binary searchoperation on another inverted list to checking hash table operation. Then we giveTDCOLHS+to optimize TDCOLHS for avoiding redundancy computation of it when thenumber of result is far less than the length of the shortest inverted list.At last, the experimental results verify the performance advangtages of our methodaccording to various evaluation metrics.
Keywords/Search Tags:XML, Dewey, SLCA, hash, clist
PDF Full Text Request
Related items