On-Line Analysis (OLAP) Of Documents

Posted on: 2011-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: W Liu
GTID: 2178360308473970
Subject: Computer software and theory
Abstract/Summary:
Data warehouses and OLAP (On-Line Analytical Processing) systems provide methods and tools for analysing the data of an enterprise information system. However, only 20% of the data in a corporate information system can be processed with current OLAP systems. The remaining 80%, held mainly in documents, stays out of reach of OLAP systems for lack of suitable tools and processes. In a decision support system, omitting the data contained in documents may lead to inaccurate analysis or erroneous decisions. Documents embody capitalized knowledge as well as analysable information-system data (sales, purchases…), so new techniques need to be added to decision support systems. Today's decision makers master the OLAP process very well. This raises a question: how can we provide an environment for online analysis of 100% of the available data, using methods the decision maker already masters?

To address this problem, we propose a new conceptual multidimensional model. Unlike traditional multidimensional models, which rely on the duality of the concepts "Fact/Dimension", the proposed model is based on a single concept, "Dimension", which models both the subjects and the axes of analysis. The model gives the decision maker a viewpoint on the multidimensional elements available for analysis.

Multidimensional analysis relies on the ability to synthesize information by aggregating it with suitable functions. However, no method exists for aggregating textual data in an OLAP environment. We therefore propose a function capable of aggregating textual data. This function summarizes a set of keywords by a smaller and more general set.

To specify analyses of data from documents, we introduce operations for manipulating the concepts of the model. In a first step, these operations allow a multivariate analysis to be specified from the elements represented by the model. In a second step, we define a core of basic operations for modifying an analysis, so that decision makers can refine their observations and make the best possible decision.

Finally, the approach is implemented in a prototype written in Java to validate our proposal. The new multidimensional structures are implemented in a DBMS. The results are summarized and returned to the user.
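
The textual aggregation function mentioned in the abstract (summarizing a set of keywords by a smaller, more general set) can be illustrated with a minimal sketch. The abstract does not specify how the thesis generalizes keywords, so everything below is an assumption made for illustration: the class name `KeywordAggregation`, the `aggregate` method, the child-to-parent concept hierarchy, and the rule of moving each keyword exactly one level up are all hypothetical and not taken from the thesis.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: NOT the function defined in the thesis.
final class KeywordAggregation {

    // Replace each keyword by its parent concept in an assumed hierarchy.
    // Keywords without a known parent are kept unchanged. Because several
    // keywords may share one parent, the result is never larger than the input.
    static Set<String> aggregate(Set<String> keywords, Map<String, String> parentOf) {
        Set<String> general = new TreeSet<>(); // sorted, duplicates collapse
        for (String keyword : keywords) {
            general.add(parentOf.getOrDefault(keyword, keyword));
        }
        return general;
    }

    public static void main(String[] args) {
        // Assumed child -> parent hierarchy, invented for this example.
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("OLAP", "decision support");
        parentOf.put("data warehouse", "decision support");
        parentOf.put("data mart", "decision support");
        parentOf.put("XML", "document format");

        Set<String> keywords = new HashSet<>(
                Arrays.asList("OLAP", "data warehouse", "data mart", "XML"));

        // Four keywords are summarized by two more general ones.
        System.out.println(aggregate(keywords, parentOf));
        // prints [decision support, document format]
    }
}
```

In this sketch the generalization is driven entirely by the supplied hierarchy; a deeper hierarchy or a different generalization rule would yield a different summary.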
Keywords/Search Tags: Decision Support System (DSS), Data Warehouse, Document Warehouse, Data Marts, Galaxy-Schema, NS2, XML