A big data multi-dimensional analysis platform observes and mines massive data sets from multiple angles and, after integration and analysis, outputs visualized data or charts that help analysts and business users understand the information the data contains. Facing explosive growth in both data volume and analysis requirements, this work uses MOLAP pre-computation technology to break through the performance bottleneck of traditional ROLAP platforms. Applying it, however, raises the following problems and challenges: 1) When pre-computation is applied, the construction and optimization of multi-dimensional data models depend on data experts, so considerable manpower is consumed as the data scale grows and analysis needs change. 2) Traditional multi-dimensional model optimization algorithms suffer from the curse of dimensionality at ultra-high dimensionality because they rely on a single evaluation indicator, and their materialized view sets jitter frequently. 3) In a system based on a hybrid engine, the ROLAP and MOLAP engines each have their own strengths, and it is difficult for the system to choose between them quickly and reasonably; an index over the multi-dimensional model is needed for query routing.

In response to these problems and requirements, this work studies technologies including big data multi-dimensional analysis, pre-computation, and multi-dimensional data indexing. The main research contents are as follows:

(1) Automatic construction and continuous optimization of multi-dimensional data models are studied and implemented. Metadata is extracted by analyzing historical query tasks, knowledge about relationships among the data is learned automatically in the background, and related views of the data tables are built to materialize multi-dimensional data models, opening up the path from raw data through pre-computation to data analysis. The multi-dimensional data model is monitored and optimized throughout its life cycle, which makes MOLAP more convenient and intelligent to use.

(2) A weighted-graph-based optimization algorithm for multi-dimensional big data models is proposed and implemented. The algorithm introduces two new evaluation indicators, average query latency and expansion rate, which weigh query performance against storage space and remove the hidden danger of the curse of dimensionality. It also divides aggregation groups by mining association information between dimensions, so that the data model can adapt to exploratory analysis and the frequent jitter of the materialized view set is reduced.

(3) Multi-dimensional data query technology based on a hybrid engine is studied and implemented. A bitmap index based on the Cube spanning tree is proposed, together with its retrieval method and the overall query routing strategy, to solve the problem of query engine selection. This bitmap index fits the structure of the multi-dimensional data model, has a small footprint, and supports fast bit operations, providing an efficient indexing solution for query routing in hybrid engines.

Finally, based on these three lines of research, a big data multi-dimensional modeling and analysis platform is designed and implemented. It is applied in the national key research and development project "Science and technology consulting technology and service platform research and development based on big data", which verifies the effectiveness and practicality of the platform and methods.
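
The query-routing idea in (3) can be illustrated with a minimal sketch. It is not the Cube spanning-tree bitmap index proposed in this work, whose structure is not given here. It only shows the underlying bit operation, under the assumption that each materialized view (cuboid) is encoded as a bitmask over the dimensions and that a query can be served by the MOLAP engine when some materialized cuboid covers all of its dimensions. The dimension names and the `CuboidIndex`, `to_mask`, and `route` helpers are hypothetical.

```python
from typing import Iterable, List, Optional

# Hypothetical dimension list; each dimension is assigned one bit.
DIMENSIONS = ["region", "product", "year", "channel"]
BIT = {name: 1 << i for i, name in enumerate(DIMENSIONS)}


def to_mask(dims: Iterable[str]) -> int:
    """Encode a set of dimension names as a bitmask."""
    mask = 0
    for d in dims:
        mask |= BIT[d]
    return mask


class CuboidIndex:
    """Flat bitmap index over materialized cuboids (illustrative only)."""

    def __init__(self, materialized: Iterable[Iterable[str]]) -> None:
        # Sort by dimension count so the smallest covering cuboid is found first.
        self.masks: List[int] = sorted(
            (to_mask(c) for c in materialized),
            key=lambda m: bin(m).count("1"),
        )

    def find_cover(self, query_dims: Iterable[str]) -> Optional[int]:
        """Return the mask of the smallest cuboid covering the query, if any."""
        q = to_mask(query_dims)
        for m in self.masks:
            if (m & q) == q:  # every query dimension is present in the cuboid
                return m
        return None


def route(index: CuboidIndex, query_dims: Iterable[str]) -> str:
    """Assumed routing rule: MOLAP if a pre-computed cuboid covers the query."""
    return "MOLAP" if index.find_cover(query_dims) is not None else "ROLAP"


index = CuboidIndex([["region", "product"], ["region", "product", "year"]])
print(route(index, ["region"]))             # covered by a materialized cuboid
print(route(index, ["region", "channel"]))  # no cuboid contains "channel"
```

The coverage test is a single AND and comparison per cuboid, which is why a bitmap encoding keeps both the footprint and the lookup cost small. A tree-shaped index such as the one described above would presumably prune this linear scan.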