Font Size: a A A

Design and implementation of an institutional data warehouse from a decision support perspective

Posted on:2009-03-20Degree:M.ScType:Thesis
University:Universite du Quebec a Trois-Rivieres (Canada)Candidate:Denis, Marie-ChantalFull Text:PDF
GTID:2448390002499198Subject:Mathematics
Abstract/Summary:
The following project has consisted in the design and implementation of a methodology for the realization of a data warehouse targeted for a university business environment. The principle objective of such an endeavor has been to empower decision makers via the use of dashboard technology. More specifically this project, based on both a theoretical and a particularly strong practical background, has been specifically within the context of better managing university student clientele.;The early phase of our project consisted of evaluating various data warehousing design methodologies. In fact, the methodology we have developed is based on Ralph Kimbal's methodology which is based on the principle of realizing dimensional models which are end-to-end business process and subject-oriented. Hence, the design methodology we shall propose herein aims at encouraging scalability as the project continues to evolve and new heterogeneous data sources become available to end-users. Subsequently, a detailed analysis of existing commercial and open-source data warehouse and business intelligence products was carried out with the aim of eventually recommending a suitable solution at UQTR for meeting the business requirements. More specifically, two types of tools were deemed necessary for the project: a back-end extraction, transformation and loading tool (ETC) and a front-end business intelligence framework for presenting data and information to end-users via the Web. Finally, we have proposed a framework for managing table definition and structure changes which ultimately aims at empowering the user with richer meta-data. This can be particularly helpful for end-users by providing helpful hints on missing values which inherently result from data structure changes.;Once the systems were integrated within our data warehouse environment, the data was made available to users either in the form of reports using various data formats (i.e. web report, MS-Excel, etc.) or OLAP analysis. Eventually, once the data warehouse shall have sufficiently evolved, future plans are underway to make data mining technology available to end-users.;One of the key challenges that has arisen during our research is how one should manage the evolution and changes to the table structures within our operational systems (OLTP), in order to minimize the impact on the data warehouse. By "evolution" we imply any addition, modification or removal of a table field. We have had to consider how to present to the end-user all these possible table structure changes and how they relate to the presence of missing values within the data warehouse. The objective has been for such biases or anomalies to be properly managed and have a transparent impact on the business end-user. Another difficulty to manage has been how to inform the end-user of events which could have occurred that can have a direct impact on the business value of the data available for a given table field. For instance, the occurrence of a strike by professors at a university could superficially increase the student failure rate. Without such key information, a business analyst would have no means for properly correcting such a situation and correctly interpreting the associated data with such an event (i.e. biased average and standard deviation measures).;During the most demanding phase of the data preparation stage, a major integration problem was observed. The legacy systems did not make use of foreign key constraints and this had a direct impact on the integrity of the data that was migrated to the data warehouse (i.e. relation model to dimensional model transformation). As such, two tools have been implemented in order to manage the integrity and extensibility of table definitions: a data analysis tool and a data definition language (DDL) analysis tool.;Though the implementation of this project shall be simplified by the use of such tools, new challenges shall arise. For instance, it shall become important to consider the "intelligent" management of large masses of data by making use of archiving and re-integration techniques in order to potentially satisfy many business analysis requirements from end-users.
Keywords/Search Tags:Data, Business, Implementation, Project, End-users, Methodology
Related items