Theory And Methods Of Fuzzy, Dynamic Multi-dimensional Data Modeling

Posted on: 2007-10-10 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Q B Liu
GTID: 1118360215470487 | Subject: Management Science and Engineering
Abstract/Summary:
The multi-dimensional data model is the technical foundation underlying data warehouse and OLAP applications, and its study is acknowledged to have significant theoretical and practical value. The dimension is a central concept in this model: its hierarchical structure lets analysts examine facts at different granularities. In existing multi-dimensional data models, the dimension hierarchy is usually based on a complete partition, with clear levels and a stable structure. However, the information describing a real-world object is often incomplete and fuzzy, and the objects themselves may be dynamic and evolving, so it is difficult to build an analytic dimensional model with clear levels and a stable structure. Taking multi-dimensional data modeling under fuzzy and dynamic conditions as its research goal, this thesis proposes a multi-dimensional data model that supports fuzzy dimensions, together with a clustering-based dimension construction method; puts forward an online aggregation algorithm based on a hierarchical sliding window model for continuous data streams; and presents a dynamic multi-dimensional data model for data streams with a corresponding online multi-dimensional aggregation algorithm. The main contributions and innovations of this thesis are:
1. Proposes a fuzzy multi-dimensional data model based on fuzzy quotient space theory. A fuzzy dimension structural model that supports incomplete partitions is obtained by introducing a fuzzy equivalence relation.
The fuzzy dimension extends the ordinary concept of dimension in two main respects. First, it extends the λ-parameterized aggregative relation between two dimension levels and supports parametric aggregation operations based on λ. Second, it establishes the λ-parameterized aggregative relation within a level and supports stepwise hierarchical aggregation. The extension is a generalization: an ordinary dimension is a special case of a fuzzy dimension. Based on the concept of fuzzy dimension, the thesis also gives formal descriptions of the fuzzy multi-dimensional data model, the fuzzy data cube, and basic OLAP operations such as drill-up, drill-down, selection, projection, and slicing. Through an in-depth analysis of the imprecise aggregation problem using the theories and methods of fuzzy granular computing, three processing methods are proposed: the conservative method, the optimistic method, and the element-derived set method. Compared with related work, the proposed model rests on the solid foundation of fuzzy quotient space theory, overcomes limitations of traditional multi-dimensional data modeling theory, and strengthens the ability to describe and model uncertain and fuzzy multi-dimensional data for analysis.
2. Puts forward a clustering-based construction method for fuzzy dimensions. To overcome the difficulty of determining the fuzzy equivalence relation, this thesis proposes two approaches to fuzzy dimension construction, chosen according to the scale of the object set: one based on fuzzy clustering and one based on relative density clustering. A clustering algorithm based on relative density is also proposed; it produces relatively stable clustering results under different parameter settings, that is, the results are not overly sensitive to the parameters.
High-density clusters can also be separated from connected low-density clusters, so clusters of multiple densities can be obtained.
3. Proposes a multi-level sliding window model for data streams and an online aggregation algorithm. In data stream processing, more detail is generally needed for the recent period than for periods further in the past. Accordingly, a multi-hierarchical time window model is proposed to describe a data stream over different time periods at multiple granularities. A multi-granularity aggregate tree data structure and a pyramidal snapshot storage structure for expired data are also designed. Performance analysis shows that these structures meet the strict requirements of online aggregation and query analysis of data streams in both storage space and processing time. To query data stream aggregates effectively at limited space and time cost, online aggregation methods and approximate query algorithms are also proposed.
4. Proposes a dynamic multi-dimensional data model for data streams, together with corresponding online multi-dimensional aggregation methods. A multi-dimensional data model for online analysis and processing of data streams is proposed, based on the time dimension patterns of the multi-hierarchical time window model. Compared with the ordinary multi-dimensional data model of a data warehouse, the proposed model has the advantage of supporting an infinite span of the time dimension and continuous changes in the data set. Because the time dimension is unbounded, no storage system can preserve all the data over the whole time domain, so modeling the time dimension of a data stream with the multi-hierarchical time window model is a necessary choice.
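The idea of keeping recent data at fine granularity and older data at coarser granularity can be illustrated with a small sketch. This is not the thesis's aggregate tree or pyramidal snapshot structure; the class name `MultiLevelWindow`, the parameters `capacity`, `fanout`, and `levels`, and the choice of sum as the aggregate are all assumptions made for illustration.

```python
from collections import deque


class MultiLevelWindow:
    """Illustrative multi-level sliding window (hypothetical design).

    Level 0 holds the most recent raw values. When a value expires from a
    level, it is buffered; once `fanout` expired values are buffered, they
    are summed into one coarser bucket and pushed to the next level.
    """

    def __init__(self, capacity=4, fanout=2, levels=3):
        self.capacity = capacity          # buckets kept per level
        self.fanout = fanout              # buckets merged into one coarser bucket
        self.levels = [deque() for _ in range(levels)]
        self.pending = [[] for _ in range(levels)]  # expired buckets awaiting merge

    def add(self, value):
        """Insert one new raw value from the stream."""
        self._push(0, value)

    def _push(self, level, value):
        window = self.levels[level]
        window.append(value)
        if len(window) <= self.capacity:
            return
        expired = window.popleft()
        if level + 1 == len(self.levels):
            return  # oldest bucket leaves the model entirely
        self.pending[level].append(expired)
        if len(self.pending[level]) == self.fanout:
            merged = sum(self.pending[level])
            self.pending[level].clear()
            self._push(level + 1, merged)

    def total(self):
        """Sum over everything still retained, at any granularity."""
        kept = sum(sum(w) for w in self.levels)
        buffered = sum(sum(p) for p in self.pending)
        return kept + buffered
```

With `capacity=2`, `fanout=2`, and `levels=2`, feeding the values 1 through 8 leaves the two newest raw values at level 0 and two coarser sums at level 1, while the oldest coarse bucket is dropped. Memory stays bounded by `capacity * levels` buckets plus the merge buffers, regardless of stream length.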
The rapid and continuous change of the data means that a reasonable model must support online multi-dimensional data aggregation. Because the observed attributes of a data stream fall into different kinds (representative, technical, supporting detail, and so on), it is very difficult to construct and select dimensions for multi-dimensional online analytical processing of the stream. This thesis presents an online clustering algorithm that supports dynamic dimensional modeling of data streams, designs a data structure that supports online clustering and multi-dimensional aggregation, and proposes an online aggregation and materialization method for the basic units of the data stream.
The research on fuzzy and dynamic multi-dimensional data modeling in this thesis has theoretical and practical significance for promoting the close integration and wider use of data warehouse, OLAP, and data mining technologies.
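As an illustration of the λ-based aggregation described under contribution 1, the sketch below uses the standard result that the λ-cut of a fuzzy equivalence relation (reflexive, symmetric, max-min transitive) is a crisp equivalence relation, so each λ yields a partition and larger λ yields a finer one. The city names, membership degrees, and sales figures are invented, and the helper names are hypothetical; this is not the thesis's formal model.

```python
from itertools import combinations


def make_relation(items, degrees, default):
    """Build a symmetric, reflexive membership table R(x, y)."""
    rel = {(x, x): 1.0 for x in items}
    for x, y in combinations(items, 2):
        d = degrees.get((x, y), degrees.get((y, x), default))
        rel[(x, y)] = rel[(y, x)] = d
    return rel


def lambda_cut_partition(items, rel, lam):
    """Partition items by the crisp relation R(x, y) >= lam.

    Comparing against one representative per block is enough only because
    R is assumed to be max-min transitive.
    """
    blocks = []
    for x in items:
        for block in blocks:
            if rel[(x, block[0])] >= lam:
                block.append(x)
                break
        else:
            blocks.append([x])
    return blocks


def aggregate(blocks, measure):
    """Roll a measure up to the level defined by the partition (sum)."""
    return [sum(measure[x] for x in block) for block in blocks]


# Hypothetical data: two loosely related pairs of cities.
CITIES = ["Beijing", "Tianjin", "Shanghai", "Hangzhou"]
REL = make_relation(
    CITIES,
    {("Beijing", "Tianjin"): 0.8, ("Shanghai", "Hangzhou"): 0.7},
    default=0.3,
)
SALES = {"Beijing": 10, "Tianjin": 5, "Shanghai": 8, "Hangzhou": 2}

for lam in (0.2, 0.5, 0.9):
    blocks = lambda_cut_partition(CITIES, REL, lam)
    print(lam, blocks, aggregate(blocks, SALES))
```

Raising λ from 0.2 to 0.5 to 0.9 splits the single group into two regional groups and then into individual cities, which mirrors drilling down through dimension levels.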
Keywords/Search Tags: OLAP, Multi-dimension Data Model, Fuzzy Multi-dimensional Data Modeling, Data Stream, Dynamic Multi-dimensional Data Modeling, Data Mining