
Research And Implementation Of Keyword Query Technique In A Seamless Integrated Relation-XML Database System

Posted on: 2012-11-17  Degree: Master  Type: Thesis
Country: China  Candidate: L Zhao  Full Text: PDF
GTID: 2268330425491583  Subject: Computer software and theory
Abstract/Summary:
As XML has become the standard for data representation and data exchange, it is applied in more and more fields, and the number of XML documents keeps growing. How to retrieve results that satisfy users from a large number of XML documents has become an important research topic in the database area. Against the background of the national 863 major database project "A Seamless Integrated Relation-XML Dual Engine Database Management System and Demonstration Application", this thesis designs and implements the XML keyword query function. The thesis divides keyword queries into those with a complex structure and those without. A keyword query with a complex structure is called a structured query. It can express the user's query requirements accurately, but it depends heavily on the user's skill: the user must master a complex query language and understand the schema information of the XML documents, so this kind of search suits skilled programmers and database administrators. A keyword query without a complex structure is simply called a keyword query. It demands little skill and suits ordinary users: the user enters one or more keywords of interest, and the system returns results that meet the user's query intent.
The thesis first introduces the design of keyword query, including the XmlInfoRelation table structure that stores XML documents, the inverted index that stores all words together with the information recorded about each word, and the algorithms used in structured query and keyword query. The XmlInfoRelation table contains the text content and schema information of the XML documents and so serves as another representation of them. The inverted index is built on the text column of this table and includes detailed location information for every word in the XML document.
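The storage design described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the row layout `(doc_id, path, text)`, the path notation, and the function names `shred` and `build_index` are assumptions made for illustration, and the actual columns of XmlInfoRelation are not given in this abstract.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def shred(doc_id, xml_text):
    """Flatten one XML document into (doc_id, path, text) rows.

    The row layout is a hypothetical stand-in for a relational
    representation of XML; tail text after child elements is ignored.
    """
    root = ET.fromstring(xml_text)
    rows = []

    def walk(elem, path):
        text = (elem.text or "").strip()
        if text:
            rows.append((doc_id, path, text))
        counts = defaultdict(int)  # per-tag sibling counter for positional paths
        for child in elem:
            counts[child.tag] += 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")

    walk(root, f"/{root.tag}[1]")
    return rows


def build_index(rows):
    """Build an inverted index: word -> list of (doc_id, path, word offset)."""
    index = defaultdict(list)
    for doc_id, path, text in rows:
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((doc_id, path, pos))
    return index


rows = shred(1, "<site><item><name>red bike</name></item>"
                "<item><name>blue bike</name></item></site>")
index = build_index(rows)
print(index["bike"])
```

Keeping the element path and word offset in each posting is what lets a query locate a word inside a document without rescanning the XML text.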
Building on this design, the thesis proposes a new keyword query algorithm, CoSQLRXSE, for the system and compares it with the ILE algorithm.
Next, the thesis gives the concrete implementation of the keyword query process. An XML keyword query instance presents the data structures and algorithms related to keyword query, which are then described in detail from three aspects: extracting information from the inverted index or by scanning the XML document, judging that information against the query conditions, and returning the matching XML documents or document fragments to the user.
Finally, we carried out experimental tests and analysis. We adopted the XMark benchmark and tested keyword query on XML documents of different sizes. The experimental results show that the inverted index significantly speeds up search, and that the proposed algorithm exploits the characteristics of the system and returns search results efficiently.
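The three aspects of query processing named above (extract, judge, return) can be sketched as a small Python function. This is a generic conjunctive keyword lookup written for illustration only; it is neither CoSQLRXSE nor ILE, whose details are not given in this abstract. The `INDEX` contents, the posting layout `(doc_id, node_path)`, and the name `keyword_query` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical postings: word -> list of (doc_id, node_path)
INDEX = {
    "red": [(1, "/site/item[1]/name")],
    "bike": [(1, "/site/item[1]/name"),
             (1, "/site/item[2]/name"),
             (2, "/site/item[1]/name")],
    "blue": [(1, "/site/item[2]/name")],
}


def keyword_query(index, keywords):
    """Return {doc_id: sorted node paths} for documents containing all keywords."""
    # Aspect 1: extract the postings of each keyword from the inverted index.
    postings = [index.get(k.lower(), []) for k in keywords]

    # Aspect 2: judge against the query condition (every keyword must occur).
    doc_sets = [{doc for doc, _ in p} for p in postings]
    matching = set.intersection(*doc_sets) if doc_sets else set()

    # Aspect 3: return the fragments (node paths) of the matching documents.
    result = defaultdict(set)
    for p in postings:
        for doc, path in p:
            if doc in matching:
                result[doc].add(path)
    return {doc: sorted(paths) for doc, paths in result.items()}


print(keyword_query(INDEX, ["red", "bike"]))
```

Document 2 contains "bike" but not "red", so it is filtered out in the judging step and only fragments of document 1 are returned.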
Keywords/Search Tags: XML, keyword query, structured query, inverted index