Font Size: a A A

Efficient storage and query processing of set-valued attributes

Posted on:2002-09-02Degree:Ph.DType:Dissertation
University:The University of Wisconsin - MadisonCandidate:Ramasamy, KarthikeyanFull Text:PDF
GTID:1460390011992035Subject:Computer Science
Abstract/Summary:
In order to better support complex applications, object relational systems provide features that are absent in relational systems. The main features a new type by providing collections of existing types. It is well known that sets are useful in modeling a great deal of real world data. However, such powerful modeling comes at a price; without an efficient implementation, using sets can yield a performance much worse than that obtained using only traditional relational constructs. This dissertation explores novel ways of implementing set-valued attributes in an object relational system. Specifically, it considers various options for storing set-valued attributes, and ways of computing the challenging set containment join operation.; We first address the problem of storing set-valued attributes. Using the orthogonal attributes of nesting and location we identify four options for representing sets: nested internal and external, and unnested external and internal. These representations can be combined with the creation of various indices to create various classes of indexed representations. We evaluate each of these representations with respect to conjunctive and disjunctive queries. Our results show that overall the nested implementations perform better than the unnested implementations because: (a) they exploit grouping semantics while fetching the members of a set instance and (b) they allow the evaluation of set predicates directly on the set instance.; Next we consider the problem of efficiently evaluating set containment joins. For unnested external representation, the set containment join can be expressed directly in SQL. By contrast, the most obvious algorithm for computing set containment joins on nested representations is the signature nested loops algorithm, which computes set signatures and compares each signature in a relation with all the signatures in the other relation. To improve on the performance of this algorithm we propose a new partitioned set join algorithm (PSJ), which uses a multi-level scheme of partitioning by replicating the inner relation. Our performance study shows that for extremely small relation and small set cardinalities, the SQL query approach and signature nested loops perform comparably to PSJ. However, as the size of the data sets increase (in both relation and set cardinality), PSJ clearly dominates.
Keywords/Search Tags:Set-valued attributes, Relation, PSJ, Set containment, Sets
Related items