Research On XML Stream Data Processing

Posted on: 2013-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Deng
Full Text: PDF
GTID: 2248330362468658
Subject: Computer Science and Technology
Abstract/Summary:
XML (eXtensible Markup Language) is a metalanguage that allows users to define their own markup languages. With the growth of the Internet, XML has been widely used for data presentation, storage, and exchange because of its simplicity and flexibility. The W3C (World Wide Web Consortium) published XQuery as a standard for manipulating XML data. Because large volumes of XML data must be processed, XQuery implementation techniques have attracted widespread attention and have become a research hot spot in the database field.
XML data is semi-structured and can be viewed as a labeled tree. An important class of queries expressed in XQuery addresses the structural features of XML data. Research on processing XML structural queries has gone through three stages: binary structural joins; processing each path separately and then combining the results; and holistic pattern matching. Holistic pattern matching abstracts the query as a pattern tree and evaluates it with an efficient, specialized algorithm. Because it conforms to the tree structure of XML data, it has become a key technology for XQuery implementation.
Recently, many Internet applications (network monitoring, publish/subscribe systems, online auctions) have adopted a streaming data model. Stream data has the following characteristics: data elements arrive online, and queries over the stream may block; the processor has no control over the order in which data elements arrive; data streams are potentially unbounded in size; and once an element has been processed it is discarded or archived, and it cannot be retrieved easily unless explicitly stored. For an XML data stream and its processor, the general features are as follows: data elements arrive as XML tokens, queries are expressed in XPath/XQuery, and the processor is driven by SAX (Simple API for XML).
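To make the SAX-driven stream model concrete, the following is a minimal sketch and not the algorithm proposed in the thesis. It assumes a simple linear path query (no branches, no descendant axis) given as a list of labels, and it keeps only a stack of the currently open element labels, so memory grows with document depth and not with stream length. The names `PathMatcher` and `run_stream` are hypothetical and chosen only for illustration.

```python
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Emit the text of every element whose root-to-node label path equals `path`."""

    def __init__(self, path, emit):
        super().__init__()
        self.path = path      # assumed query form, e.g. ["auction", "item", "name"]
        self.emit = emit      # callback invoked as soon as a match is complete
        self.stack = []       # labels of the currently open elements
        self.buf = None       # text buffer, active only inside a matching element

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.path:
            self.buf = []

    def characters(self, content):
        if self.buf is not None:
            self.buf.append(content)

    def endElement(self, name):
        if self.buf is not None and self.stack == self.path:
            self.emit("".join(self.buf))   # answer is produced incrementally
            self.buf = None
        self.stack.pop()


def run_stream(chunks, path):
    """Feed XML text chunk by chunk, as it would arrive on a stream."""
    results = []
    parser = xml.sax.make_parser()
    parser.setContentHandler(PathMatcher(path, results.append))
    for chunk in chunks:
        parser.feed(chunk)    # tokens are handled in arrival order
    parser.close()
    return results


if __name__ == "__main__":
    chunks = [
        "<auction><item><na",
        "me>vase</name><bid>5</bid></item><item><name>lamp</name></item></auction>",
    ]
    print(run_stream(chunks, ["auction", "item", "name"]))
```

Each match is handed to the callback when its end tag arrives, without waiting for the end of the stream. A twig pattern with branches would additionally require buffering candidate results until the branch conditions are resolved, which this sketch does not attempt.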
The overall goal of XML data management is to process data arriving online and complex queries effectively with limited memory consumption.
This thesis analyzes the development, current status, and open problems of XML stream data processing, and proposes a twig pattern matching algorithm for XML streams that combines automata with holistic pattern matching. The algorithm extracts the XML fragments addressed by a twig pattern from sequentially accessed XML stream data, and achieves fast, incremental query answering with a small amount of memory. To address blocking operations in a stream environment, this thesis also designs a block-solving strategy for common blocking operations (those that cannot output precise answers until they have read their whole input). An XQuery processor for XML data streams gains functionality and flexibility when equipped with this strategy. This work is helpful for constructing XML data stream processors and for implementing XQuery efficiently in different scenarios.
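The abstract does not describe the strategy it designs for blocking operations, so the following sketch only illustrates the problem and one generic workaround. It is not the thesis's method. An average over a stream is blocking because the exact answer is known only after the whole input has been read. A common alternative, assumed here purely for illustration, is to emit a running estimate after every element. The function name `running_average` is hypothetical.

```python
def running_average(values):
    """Yield the average of all values seen so far, one estimate per input element.

    The final yielded value equals the exact (blocking) answer; earlier values
    are partial answers available before the stream ends.
    """
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
        yield total / count


if __name__ == "__main__":
    # The input could be an unbounded generator; a short list is used here.
    for estimate in running_average([2, 4, 6]):
        print(estimate)
```

The operator keeps only two numbers of state, so it fits the limited-memory setting, at the cost of reporting approximate answers until the input ends.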
Keywords/Search Tags: XML, stream data processing, XQuery, twig pattern matching, automaton