Font Size: a A A

The Research Of XML Based Dynamic Information Trigger Mechanism

Posted on:2003-05-02Degree:DoctorType:Dissertation
Country:ChinaCandidate:H Y XuFull Text:PDF
GTID:1118360092498831Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
There are new requirements of users with the constantly increasing of the data publication on the web, which brings on two remarkable problems. 1) In face of the huge amount of information on the web, how can the users get exactly what they really want, while most of the information they do not care. 2) How to notify the users when the information they are interested is changed.To solve the problems, the system needs two kinds of capabilities. One is that for a given information unit, the system can filter it against the user profiles firstly, and then deliver the useful information in right granularity and form to the users. Another is that the system should identify the change of the information at proper intervals, and notify the users interested in time. Essencially, the capabilities described above implie to establish the association between information and the user profiles, denoted by the information trigger mechanism. The information trigger mechanism can be divided into two kinds of forms, one is static form, and another is dynamic form. The focus of this thesis is mainly given to the latter one.The thesis presents a framework named XFDS, on the basis of change detection technology and subscription matching technology. The framework is designed to monitor the fetching of millions of XML documents per day, while supporting millions of subscriptions.Based on the XFDS, of the thesis addresses two bottlenecks: change detection of XML documents and subscription matching.To the former one, since most previous work depends on the signature of nodes, which would be costly to be computed, a new way is proposed to discriminate those nodes that have the same paths in the intrinsic properties of the XML document. Towards this end, the thesis introduces a series of notions related to the key path, which make the change detection of XML document easier.Based on the notions, a change detection algorithm named KF-Diff is presented, which is tailored to unordered trees. The complexity of KF-Diff is O(nlogn), while n is the number of the nodes in the tree, vs. polynomial time for previous algorithms. KF-Diff makes progress in the matching step and the filtering step, and can produce fairly good results. In addition, KF-Diff can detect all kinds of move operations. Yet, KF-Diff is not efficient enough to be used in the enterprise applications.To solve the problem, the thesis proposes a highly efficient algorithm named KF-Diff+. The algorithm transforms the traditional tree-to-tree correction into the comparing of the key trees, which are substantially label trees without duplicate paths. Thus, the algorithm achieves high efficiency with the complexity of O(n), where n is the total number of nodes in the trees, which is significant to the large scaled applications. Different from the previous work, KF-Diff+ is tailored to both ordered trees and unordered trees.While dealing with the key path, the notion of the key constraint for semi-structure data and the notion of the multi-instance based keys are identified,which will greatly simplify the complexity of the algorithm. The thesis addresses the problems associated with the key constraint, such as satisfiability and implication etc, and furthermore gives some inference rules, proposes determinant algorithms and analyses the algorithms accordingly.To implement the subscription mapping, the abstract model space of the documents is identified at first, which can be tailored to documents with or without schema. Then, the thesis presents a schema based two-level associated model, which dramatically reduces the number of the candidate subscriptions. In addition, an incremental subscription matching algorithm is proposed to efficiently matching events for millions subscriptions.Furthermore, since indexing is very important to the system, the indexing scheme for XML data is defined based on the containment relationship, which can be more efficient to deal with relative paths than previous methods. The key properties of the model are concluded as fol...
Keywords/Search Tags:XML, Key, Change Detection, Subscription, Index, Trigger
PDF Full Text Request
Related items