The Research On Parallel Query Technology Of XML Streaming Data Based On Macro Forest Transducers

Posted on: 2018-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Qin
GTID: 2348330563952313
Subject: Software engineering
Abstract/Summary:
XML has inherent advantages: it is extensible, it separates content from presentation, and it makes it convenient to transfer information between different systems. It has therefore been widely used in information management, e-commerce, mobile communications, electronic document exchange and other fields, and it has become the standard for data representation, transmission and exchange in the new generation of the Internet. With the rapid development of the Internet, many network-based application systems produce large dynamic data sets, for example in network log collection and analysis, technical analysis of the stock market, Internet security monitoring and location monitoring. In such real-time systems, large dynamic data sets grow without bound over time. Such data is called a data stream, and XML stream data is its main form. Because neither the boundary nor the size of stream data can be predicted, storing all of it is impractical. Traditional query mechanisms for XML databases are no longer applicable to XML data streams, which are high-speed, ordered and real-time and can be scanned only once; this poses a new challenge for XML stream query processing. How to process such data efficiently, and how to filter, parse and query it, has therefore become a hot topic in XML data stream research.

An automaton is an input-driven model, which matches the way XML stream data arrives online and is turned into events by SAX parsing, and earlier work on XML processing contains many automaton-based methods. In recent years, XML stream data has been processed with macro forest transducers, finite automata that take an XML forest as input and produce an XML forest as output. Their processing performance has reached a high level, which makes them well suited to XML data streams. Querying an XML data stream largely comes down to evaluating XPath queries, so it is natural to apply automata techniques to XPath query processing and to use them to model XPath expressions.

This thesis presents a method for parallel processing of XPath queries based on macro forest transducers. It exploits both the high performance of the XML processor and the capabilities of multi-core processors to improve query efficiency. The method generates a query automaton from the XPath query expression. An XML parser turns the XML stream into an event stream, which the query transducer takes as input. The query transducer moves between states according to the input events, and different tasks within the query are assigned to different threads for execution. As soon as part of the data stream matches the XPath query expression, the query transducer outputs the query result. The XPath fragment supported by this method includes the parent-child (PC) axis, the ancestor-descendant (AD) axis, any number of parallel predicates and predicates nested to any depth. The experimental results show that the parallel query method not only supports complex queries but also executes efficiently.
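
The event-driven matching idea described in the abstract can be illustrated with a minimal sketch. It is not the macro forest transducer of the thesis: it is a plain stack-based automaton over SAX events, it assumes only the parent-child step (/) and the ancestor-descendant step (//), and it leaves out predicates and multi-threading entirely. The names parse_steps, PathMatcher and run_query are hypothetical and chosen only for this illustration.

```python
import xml.sax


def parse_steps(query):
    """Split a path such as '/a//b/c' into [('child','a'), ('desc','b'), ('child','c')]."""
    steps = []
    i = 0
    while i < len(query):
        if query.startswith("//", i):
            axis, i = "desc", i + 2
        elif query.startswith("/", i):
            axis, i = "child", i + 1
        else:
            raise ValueError("each step must start with '/' or '//'")
        j = i
        while j < len(query) and query[j] != "/":
            j += 1
        if j == i:
            raise ValueError("empty step name")
        steps.append((axis, query[i:j]))
        i = j
    return steps


class PathMatcher(xml.sax.ContentHandler):
    """Illustrative matcher: state k means 'the first k steps have been matched'."""

    def __init__(self, query):
        super().__init__()
        self.steps = parse_steps(query)
        self.stack = [{0}]   # one set of active states per open element
        self.path = []       # names of the currently open elements
        self.matches = []    # paths of elements that satisfy the query

    def startElement(self, name, attrs):
        self.path.append(name)
        active = set()
        for k in self.stack[-1]:
            if k == len(self.steps):
                continue  # accepting state: nothing left to match below it
            axis, label = self.steps[k]
            if label == name or label == "*":
                active.add(k + 1)   # this element matches step k
            if axis == "desc":
                active.add(k)       # a descendant step may still match deeper
        if len(self.steps) in active:
            # Emit the result as soon as the start tag arrives (single scan).
            self.matches.append("/" + "/".join(self.path))
        self.stack.append(active)

    def endElement(self, name):
        self.stack.pop()
        self.path.pop()


def run_query(query, xml_bytes):
    """Parse xml_bytes once and return the paths of all matching elements."""
    handler = PathMatcher(query)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


if __name__ == "__main__":
    doc = (b"<lib><book><title/></book>"
           b"<shelf><book><title/></book></shelf><title/></lib>")
    for q in ("/lib//book/title", "//title", "/lib/title"):
        print(q, run_query(q, doc))
```

Keeping one set of active states per open element lets the matcher restore the parent's states when an end tag arrives, so the stream is scanned once and nothing beyond the current root-to-node path is stored. A real implementation of the thesis method would additionally have to handle predicates and distribute the work across threads, which this sketch does not attempt.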
Keywords/Search Tags: XML stream, Macro forest transducers, XPath query, Multi-thread parallelism