Font Size: a A A

Indexing of multidimensional discrete data spaces and hybrid extensions

Posted on:2010-11-30Degree:Ph.DType:Thesis
University:Michigan State UniversityCandidate:Chen, ChangqingFull Text:PDF
GTID:2448390002475642Subject:Engineering
Abstract/Summary:
In this thesis various indexing techniques are developed and evaluated to support efficient queries in different vector data spaces.;Various indexing techniques have been introduced for the (ordered) Continuous Data Space (CDS) and the Non-ordered Discrete Data Space (NDDS). All these techniques rely on special properties of the CDS or the NDDS to optimize data accesses and storage in their corresponding structures.;Besides conventional exact match queries, the similarity queries and the box queries are two types of fundamental operations widely supported by modern indexing techniques. A box query is different from a similarity query in that the box query in multidimensional spaces tries to look up indexed data which meet query conditions on each and every dimension. The difference between similarity queries and box queries suggests that indexing techniques which work well for similarity queries may not necessarily support efficient box queries. In this thesis, we propose the BoND-tree, a new indexing technique designed for supporting box queries in an NDDS. Both our theoretical analysis and experimental results demonstrate that the new heuristics proposed for the BoND-tree improve the performance of box queries in an NDDS significantly.;The Hybrid Data Space (HDS) is a multidimensional data space which contains both (ordered) continuous and non-ordered discrete dimensions. In this thesis a novel indexing structure, the C-ND tree, has been developed to support efficient similarity queries in HDSs. To do so, some geometric concepts in the HDS are introduced. Novel node splitting heuristics which exploit characteristics of both CDS and NDDS are proposed. Our extensive experimental results show that the C-ND tree is quite promising in supporting similarity queries in HDSs.;To support box queries in the HDS, we extended the original ND-tree to the HDS to evaluate the effectiveness of the ND-tree heuristics on supporting box queries in an HDS. A novel power value adjustment strategy is used to make the continuous and discrete dimensions comparable and controllable in the HDS. An estimation model is developed to predict the box query performance of the hybrid indexing. Our experimental results show that the original ND-tree heuristics are effective in supporting box queries in an HDS, and could be further improved with our power adjustment strategies to address the characteristics of the HDS.
Keywords/Search Tags:Data space, Indexing, HDS, Queries, Support efficient, Discrete, NDDS, Hybrid
Related items