A Result Size Estimation Algorithm For Value Predication In XML Query

Posted on:2009-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:Z Wang

Full Text:PDF

GTID:2178360278457600

Subject:Computer application technology

Abstract/Summary:

In recent years, XML (Extensible Markup Language) has become a new standard for data representation and data exchange on the Internet. Despite solid achievements in XML research, XML query technology still faces many difficulties in both theory and performance because of the inherent characteristics of XML. Building on a survey and summary of current research, development, and applications, this thesis analyzes the optimization of XML queries in detail from the following aspects: the XML data model, the storage of XML data in databases, the parsing of XML data, and query processing methods.

A variety of XML query methods have been proposed, but they still give inadequate consideration to complex XML data distributions, which results in low efficiency. This thesis elaborates on query estimation technology from both one-dimensional and multi-dimensional perspectives and, taking the characteristics of XML into account, proposes using multi-dimensional histograms to summarize XML data in order to simplify processing.

The value distribution of XML data depends not only on the distribution of other values but also on the structural information of the XML, which leads to a multi-dimensional set of dependent elements when the structure itself is complex. In that case, storage cost and error rate rise considerably. Therefore, this thesis applies the discrete cosine transform (DCT) to XML data and, based on the high correlation of XML data, extends the DCT to a high-dimensional model, which yields a high-dimensional DCT equation. The resulting algorithm proves efficient in reducing statistical error, processing time, and memory.

Any proposed method needs careful and comprehensive experimental validation. In the experiments, all data are generated in the normalized data space (0,1)^n.
In addition, synthetic data sets of 50K records are generated, with dimensionality ranging from 2 to 10. The data follow three distributions: (1) normal, (2) Zipf, and (3) clustered. They are used to evaluate (1) storage requirements and selectivity estimation time, (2) the effect of dimensionality and query size, and (3) the effect of data distribution. Extensive experiments show that the proposed method is superior to previous ones, with the following advantages:
1) Previous methods could not support multi-dimensional selectivity estimation, particularly in more than three dimensions, whereas our method supports high-dimensional selectivity estimation with high accuracy.
2) Our method saves time and space.
3) Our method eliminates the periodic reconstruction of statistics for selectivity estimation, because dynamic data updates are reflected in the statistics immediately.
4) Our method calculates the selectivity simply, using the integral of cosine functions. The estimate is also accurate because the method naturally supports interpolation between adjacent buckets.
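
The idea summarized above (compress a multi-dimensional histogram with the DCT, update the coefficients directly, and estimate a range query by integrating cosine functions) can be illustrated with a minimal sketch. It is not the thesis's implementation: the class and function names are hypothetical, and keeping only the lowest `kept` frequencies per dimension is an assumed coefficient-selection rule chosen for simplicity. Points are assumed to lie in the normalized space (0,1)^n.

```python
import math
from itertools import product


def basis(u, n, buckets):
    """Orthonormal DCT-II basis value for frequency u at bucket index n."""
    k = math.sqrt(0.5) if u == 0 else 1.0
    angle = (2 * n + 1) * u * math.pi / (2 * buckets)
    return math.sqrt(2.0 / buckets) * k * math.cos(angle)


def cosine_integral(u, a, b, buckets):
    """Integral over [a, b] of the continuous basis, scaled to bucket counts."""
    k = math.sqrt(0.5) if u == 0 else 1.0
    if u == 0:
        area = b - a
    else:
        area = (math.sin(u * math.pi * b) - math.sin(u * math.pi * a)) / (u * math.pi)
    return buckets * math.sqrt(2.0 / buckets) * k * area


class DctHistogram:
    """Hypothetical sketch: a histogram stored only as low-frequency DCT coefficients."""

    def __init__(self, dims, buckets, kept):
        self.buckets = buckets
        # Assumption: keep frequencies 0..kept-1 in every dimension.
        self.coeffs = {u: 0.0 for u in product(range(kept), repeat=dims)}

    def update(self, point, weight=1.0):
        """Insert (weight=1) or delete (weight=-1) a point; the DCT is linear,
        so the coefficients are adjusted without rebuilding the histogram."""
        cells = [min(int(x * self.buckets), self.buckets - 1) for x in point]
        for u in self.coeffs:
            term = weight
            for u_d, n_d in zip(u, cells):
                term *= basis(u_d, n_d, self.buckets)
            self.coeffs[u] += term

    def estimate(self, low, high):
        """Estimated number of points in the box [low, high], computed as a
        sum of products of one-dimensional cosine integrals."""
        total = 0.0
        for u, g in self.coeffs.items():
            term = g
            for u_d, a, b in zip(u, low, high):
                term *= cosine_integral(u_d, a, b, self.buckets)
            total += term
        return total


# Usage: 500 two-dimensional points whose first coordinate lies in [0, 0.5).
hist = DctHistogram(dims=2, buckets=32, kept=8)
points = [(((i * 0.6180339887) % 1.0) * 0.5, (i * 0.7548776662) % 1.0) for i in range(500)]
for p in points:
    hist.update(p)
left_half = hist.estimate((0.0, 0.0), (0.5, 1.0))
whole_space = hist.estimate((0.0, 0.0), (1.0, 1.0))
```

Because the estimate integrates a continuous cosine series rather than summing whole buckets, a query boundary that falls inside a bucket is interpolated automatically. The query over the whole space recovers the exact record count, since only the zero-frequency coefficient contributes there.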

Keywords/Search Tags:

XML, Result Size Estimation, Discrete Cosine Transform

Related items

1	Research And Hardware Design Of Discrete Cosine Transform
2	Research On Recursive IMDCT Algorithms And Design Of An Audio DSP Core
3	Research On Image Watermarking Algorithm Based On Discrete Cosine Transform And Wavelet Domain
4	Design And Research Of Discrete Cosine Transform IP Core
5	Variable Step-size LMS Algorithm Based On Discrete Cosine Transform For Noise Cancellation
6	Design And Implementation Of Discrete Cosine Transform
7	The Hardware Configuration Of Discrete Cosine Transform Based On Distributed Algorithms
8	VLSI Design For Discrete Cosine Transform
9	Research On Three Dimension Video Compression Based On Discrete Cosine Transform And Wavelet Transform
10	Image Encryption Based On A Reality-preserving Fractional Discrete Cosine Transform And Chaos