Research On Keyword Search Based On XML Data

Posted on:2013-05-16

Degree:Master

Type:Thesis

Country:China

Candidate:J Cui

Full Text:PDF

GTID:2248330362462565

Subject:Computer system architecture

Abstract/Summary:

As the standard for representing and exchanging data on the network, XML is widely used in practical applications, and more and more structured or semi-structured data is represented and transmitted in XML format. As a simple and effective way to obtain information, keyword search over XML data has long been a hot research topic. Efficiency is as important as the validity of results, because XML keyword search must provide real-time feedback to a large number of users in a network environment. This paper addresses the inefficiency of existing XML keyword search techniques. The main research of this paper is as follows.

Firstly, the paper analyzes the existing keyword search algorithms in depth and concludes that the fundamental reason for their low efficiency is that they process all nodes in the set of inverted lists more than once. In addition, when building a query result subtree, they obtain the node labels of an element's ancestors or descendants by frequently comparing Dewey encodings, which incurs additional time cost.

Secondly, to address these problems, we propose a novel method, named fast group, that reduces the number of scans over the inverted lists, and then propose an algorithm based on this method, named FastMatch. The algorithm constructs all result subtrees that meet certain conditions by scanning all nodes of the set of inverted lists only once. In addition, we propose a new preprocessing scheme that adds a path index to the Dewey encoding and noticeably improves system performance.

Finally, we have implemented two algorithms, MaxMatch and FastMatch, and evaluated the feasibility of FastMatch in terms of the number of matched nodes, query time, and scalability on different data sources. The experimental results, compared against MaxMatch, verify the high performance of our method.
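
The ideas the abstract relies on (Dewey labels, inverted lists, and visiting each posting once) can be illustrated with a minimal Python sketch. This is not the thesis's FastMatch or fast group method, whose details are not given here. It is a generic smallest-LCA (SLCA) computation under stated assumptions: Dewey labels are modeled as tuples of integers, an inverted list maps a keyword to its matching labels, and the names `is_ancestor`, `lca`, and `slca` are hypothetical.

```python
def is_ancestor(a, b):
    """True if Dewey label a is a proper ancestor of b (a is a strict prefix of b)."""
    return len(a) < len(b) and b[:len(a)] == a


def lca(a, b):
    """Lowest common ancestor of two Dewey labels: their longest common prefix."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def slca(inverted_lists):
    """Return nodes whose subtree contains every keyword and that have no
    child with the same property.

    inverted_lists: dict mapping keyword -> list of Dewey labels (tuples).
    Each posting is visited exactly once; the keyword bit of a posting is
    pushed to all of its prefixes (the node itself and its ancestors).
    """
    keywords = list(inverted_lists)
    full = (1 << len(keywords)) - 1  # bitmask meaning "all keywords seen"
    masks = {}
    for i, keyword in enumerate(keywords):
        for label in inverted_lists[keyword]:
            for depth in range(1, len(label) + 1):
                node = label[:depth]
                masks[node] = masks.get(node, 0) | (1 << i)

    covering = [node for node, mask in masks.items() if mask == full]
    # If a node covers all keywords, so do its ancestors, so a covering
    # node is "smallest" exactly when none of its children is covering.
    parents_of_covering = {node[:-1] for node in covering}
    return sorted(node for node in covering if node not in parents_of_covering)


# Hypothetical document: root 1 with three children 1.1, 1.2, 1.3.
example_lists = {
    "xml": [(1, 1, 1), (1, 2, 1)],
    "cui": [(1, 1, 2), (1, 3, 2)],
}
print(slca(example_lists))
```

In this example both the root `(1,)` and the node `(1, 1)` contain the two keywords, but only `(1, 1)` is reported because the root has a covering child. The sketch still walks every prefix of every label, which is the kind of repeated ancestor work that the abstract says a path index added to the Dewey encoding is meant to reduce.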

Keywords/Search Tags:

XML, keyword search, efficiency, Dewey encoding, fast group, FastMatch

Related items

1	Research On Slca-Based Keyword Search Over XML Documents
2	Study And Implementation On An Improved Approach Based On Dewey Coding For XML Meaningful Fragment
3	Research On Top-k Keyword Search Algorithm In Probabilistic XML Document
4	XSemantic: The Research Of Keyword Search On XML Documents Based On Keyword Expansion
5	Research On XML Keyword Search Processing Method Based On SLCA Semantic
6	Research On Performance Optimization Of Keyword-Driven Analytical Processing
7	Research On The Encrypted Data Supporting Synonymous Multi-keyword Fuzzy Search In The Cloud Computing
8	Fast Keyword Spotting In Handwritten Chinese Documents
9	Research Of Keyword Search On Large Scale RDF Data
10	Research On Group Steiner Tree Based Search Algorithms Over Knowledge Graphs