Study On Some Key Techniques Of Non-fully Structured XML Query Processing

Posted on:2007-09-24

Degree:Doctor

Type:Dissertation

Country:China

Candidate:X G Li

Full Text:PDF

GTID:1118360185977713

Subject:Computer software and theory

Abstract/Summary:


With the rapid development of Internet/Intranet technology and of techniques for integrating and storing heterogeneous information, huge amounts of semi-structured data, such as XML documents, are emerging on the network. Because XML is self-describing and has a flexible data structure, it is becoming one of the standards for data definition, storage and exchange. As a key technique for the effective management of XML documents, non-fully structured XML query processing has recently attracted more and more researchers.

Non-fully structured XML query (NFS query) is a technique for querying XML documents that lack full structural information. An NFS query addresses situations in which the user does not fully know the structure of an XML document, the document provides no structural information, or the documents are heterogeneous. In each of these situations, a user cannot write a regular query that expresses his intention accurately. In practice, especially on the Internet/Intranet, most XML documents lack structural information or are heterogeneous, so NFS query has become increasingly popular in recent years. This dissertation studies in depth two key techniques of non-fully structured XML query processing: the determination of meaningful query results and content-based result clustering.

The determination of meaningful query results is a very important step in NFS query. Most determination methods in previous work, such as the Interconnection Relationship in the XSEarch system and MLCA in the Timber system, were proposed from a specific point of view, so they apply only to certain kinds of XML documents. Moreover, they become infeasible for large-scale documents: for example, both the time to build the index in XSEarch and the query time in Timber are far beyond what users will tolerate.

This dissertation proposes a general determination model based on the concepts of pattern and instance, called the PE model. The PE model is a system-oriented model and can be widely accepted by users. In fact, the PE model is a scalable framework that is independent of the definitions of equivalent pattern and equivalent query term. Under the framework of the PE model, this dissertation proposes a structure-similarity-based method to compute equivalent patterns, and puts forward a determination rule. To improve the efficiency of NFS querying, this dissertation...
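The problem of determining meaningful results can be illustrated with a small sketch. It is not the PE model, nor the XSEarch or Timber algorithms, whose details the abstract does not give. It only assumes a simple, commonly used rule: an element is a candidate result if its subtree contains every query keyword and none of its child subtrees does. The function names `text_terms` and `smallest_matches` and the sample document are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the user is assumed not to know its structure.
DOC = """
<bib>
  <book><title>XML Query</title><author>Li</author></book>
  <book><title>Data Mining</title><author>Wang</author></book>
  <article><title>XML Clustering</title><author>Wang</author></article>
</bib>
"""


def text_terms(elem):
    """Lower-cased tag name plus the words of the element's leading text."""
    words = {elem.tag.lower()}
    words.update((elem.text or "").lower().split())
    return words


def smallest_matches(root, keywords):
    """Return elements whose subtree covers every keyword while no child subtree does.

    This is an illustrative rule (an assumption), not the dissertation's
    determination rule.
    """
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(elem):
        # Keywords found directly on this element.
        covered = text_terms(elem) & wanted
        child_complete = False
        for child in elem:
            sub_covered, sub_complete = visit(child)
            covered |= sub_covered
            child_complete = child_complete or sub_complete
        complete = covered == wanted
        # Keep only the smallest subtrees that contain all keywords.
        if complete and not child_complete:
            results.append(elem)
        return covered, complete

    visit(root)
    return results


root = ET.fromstring(DOC)
hits = smallest_matches(root, ["xml", "wang"])
print([h.tag for h in hits])  # prints ['article']
```

Without the "no child subtree is complete" condition, the root `bib` element would also match, because it links the keyword "XML" in the first book with the author "Wang" of the second book, a combination that is unlikely to reflect the user's intention. Rules of this kind are what a determination model for meaningful results has to formalize.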

Keywords/Search Tags:

XML Document databases, XML query, Non-fully structured XML query, Document clustering, Clustering skew, Feature reduction, Information theory


Related items

1	Research On Efficient Document Clustering Using Improvised Sub-Document Based Framework
2	Research On Query Processing On XML Data
3	Search term selection and document clustering for query suggestion
4	Research Of Query On The Probabilistic XML Document
5	Measuring the stability of query term collocations and using it in document ranking
6	A comparative study of keyphrase-based query-specific clustering on WWW
7	Visualization of search engine query result using region-based document model on XML documents
8	Study On Clustering For XML Document Collection
9	Efficient structural query processing in XML databases
10	Research Of Document Retrieval System Based On Fuzzy Query Technology