Research On Keyword Search Based On XML Data

Posted on: 2013-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: J Cui
Full Text: PDF
GTID: 2248330362462565
Subject: Computer system architecture
Abstract/Summary:
As the standard for representing and exchanging data on the network, XML is widely used in practical applications, and more and more structured and semi-structured data is represented and transmitted in XML format. As a simple and effective way to obtain information, keyword search over XML data has long been an active research topic. The efficiency of search processing is as important as the validity of the results, because XML keyword search is deployed in network environments that must give real-time feedback to a large number of users. This thesis aims to solve the inefficiency of existing XML keyword search techniques. Its main work is as follows.

First, the thesis analyzes the existing keyword search algorithms in depth and concludes that the fundamental reason for their low efficiency is that they process all nodes in the set of inverted lists more than once. In addition, when building a query result subtree, they obtain the labels of an element's ancestors or descendants by frequently comparing Dewey codes, which adds further time cost.

Second, to address these problems, we propose a novel method, named fast group, that reduces the number of scans over the inverted lists, and then propose an algorithm based on this method, named FastMatch. The algorithm constructs all result subtrees that satisfy certain conditions by scanning the nodes in the set of inverted lists only once. We also propose a new preprocessing scheme that adds a path index to the Dewey encoding, which noticeably improves system performance.

Finally, we implemented two algorithms, MaxMatch and FastMatch, and evaluated our algorithm in terms of the number of matched nodes, query time, and scalability on different data sources. The experimental results, compared against MaxMatch, verify the high performance of our method.
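
To make the terms in the abstract concrete, the following is a minimal Python sketch of Dewey labels, keyword inverted lists, and a single merged pass over those lists. It is not the FastMatch algorithm or the fast group method, whose details the abstract does not give. The function names (`is_ancestor`, `lca`, `group_in_one_pass`), the fixed grouping depth, and the sample data are all assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# A Dewey label is the path of child positions from the root, e.g. (1, 2, 1).
Dewey = Tuple[int, ...]


def is_ancestor(a: Dewey, d: Dewey) -> bool:
    """a is a proper ancestor of d exactly when a is a proper prefix of d."""
    return len(a) < len(d) and d[:len(a)] == a


def lca(a: Dewey, b: Dewey) -> Dewey:
    """Lowest common ancestor = longest common prefix of the two labels."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def group_in_one_pass(
    inverted: Dict[str, List[Dewey]], depth: int
) -> Dict[Dewey, Dict[str, List[Dewey]]]:
    """Visit every entry of every inverted list exactly once.

    Assumption for this sketch: candidate result roots are the ancestors at a
    fixed `depth`, and each inverted list is already sorted in document order
    (tuple order on Dewey labels).
    """
    streams = [[(label, kw) for label in labels] for kw, labels in inverted.items()]
    groups: Dict[Dewey, Dict[str, List[Dewey]]] = {}
    # heapq.merge walks the sorted lists together without re-scanning any list.
    for label, kw in heapq.merge(*streams):
        root = label[:depth]
        groups.setdefault(root, {}).setdefault(kw, []).append(label)
    # Keep only subtrees that contain at least one match for every keyword.
    return {r: g for r, g in groups.items() if len(g) == len(inverted)}


# Hypothetical inverted lists: keyword -> Dewey labels of matching nodes.
sample = {
    "xml": [(1, 1, 2), (1, 2, 1)],
    "search": [(1, 1, 3), (1, 3, 1)],
}
result = group_in_one_pass(sample, depth=2)
```

Here only the subtree rooted at `(1, 1)` contains both keywords, so it is the only group kept. The prefix checks in `is_ancestor` and `lca` are the kind of repeated Dewey comparison that the abstract identifies as a cost in existing methods.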
Keywords/Search Tags: XML, keyword search, efficiency, Dewey encoding, fast group, FastMatch