Data warehouse schema evolution with extended hierarchy semantics

Posted on: 2008-10-04
Degree: Ph.D.
Type: Dissertation
University: University of Cincinnati
Candidate: Banerjee, Sandipto
Full Text: PDF
GTID: 1448390005954462
Subject: Computer Science
Abstract/Summary:
Data warehouse technology provides a way to visualize data in a form that supports the decision-making process. This visualization is provided by a multi-dimensional schema that captures user needs in terms of data content, constraints on data, and views of data. User requirements can change over time, which forces the schema to be redesigned from scratch. Redesigning a schema is expensive in both resources and time. A solution to this problem is a schema evolution process that lets an existing schema evolve instead. Researchers have proposed models for the conceptual design of data warehouse schemas, but none has provided a formal definition of semantics to support schema evolution.

As a first step toward schema evolution, we introduce a formal model that captures the conceptual modeling features of a data warehouse. These features are categorized as core and advanced. Previous authors have defined the core features, which capture the simple semantic information of a multi-dimensional schema. In our work we extend the semantics of our model to represent complex information by defining advanced features such as non-strict, non-onto, non-covering, and multiple hierarchies. Model constraints are defined to maintain integrity as a schema evolves over time.

In the second step we design schema evolution operators for making changes to a multi-dimensional schema. To visualize these two steps we implement a software tool that helps create semantically correct schemas and understand the impact of evolution on a schema with extended semantics, using stored procedures and triggers for integrity enforcement. We further extend the formal model to support instance data and map the relationships between these instances in a lattice framework. When a schema evolves, the aggregations of the new instance data can be derived from the aggregations of the instances of the original schema, thus saving resources and improving the decision-making capability of a data warehouse.
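
As a rough illustration of the closing point about reusing aggregations, the sketch below derives coarser-level SUM totals from finer-level totals and guards against non-strict hierarchies. It is not the dissertation's formal model or tool: the function names (`is_strict`, `roll_up_sum`), the dictionary representation of a hierarchy step, and the sample data are all hypothetical, and "strict" is used in its usual sense in the multi-dimensional modeling literature (each child member has at most one parent).

```python
from collections import defaultdict


def is_strict(rollup):
    """Return True when every child member maps to at most one parent."""
    return all(len(parents) <= 1 for parents in rollup.values())


def roll_up_sum(child_totals, rollup):
    """Derive parent-level SUM totals from existing child-level totals.

    This reuse is only safe for a strict step; in a non-strict step a
    child's total would be counted once for each of its parents.
    Children with no parent in `rollup` contribute to no parent total.
    """
    if not is_strict(rollup):
        raise ValueError("non-strict hierarchy: SUM cannot be reused as-is")
    parent_totals = defaultdict(int)
    for child, total in child_totals.items():
        for parent in rollup.get(child, ()):
            parent_totals[parent] += total
    return dict(parent_totals)


# Hypothetical instance data: city-level totals and a city -> state step.
city_sales = {"Cincinnati": 120, "Cleveland": 80, "Louisville": 50}
city_to_state = {
    "Cincinnati": {"Ohio"},
    "Cleveland": {"Ohio"},
    "Louisville": {"Kentucky"},
}
state_sales = roll_up_sum(city_sales, city_to_state)
```

Under these assumptions, adding a state level above an existing city level would not require rescanning the underlying facts; the city totals already held by the warehouse are enough. The strictness check hints at why the advanced hierarchy types need explicit constraints before aggregates can be reused.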
Keywords/Search Tags:Data warehouse, Schema, Semantics, Process