
Analysis Of Airline Ticket Settlement Data Based On Multidimensional Data Model

Posted on: 2018-08-03  Degree: Master  Type: Thesis
Country: China  Candidate: S Qian  Full Text: PDF
GTID: 2359330533960209  Subject: Computer technology
Abstract/Summary:
With the vigorous development of China's civil aviation industry, more and more passengers choose to travel by air. As airline passenger traffic increases rapidly, ticket settlement data is also growing explosively. Long-term accumulation has made the ticket settlement data both multidimensional and large, which makes it a great challenge for traditional Business Intelligence (BI) systems to analyze. Constructing a multidimensional data cube and using the latest distributed computing technology to speed up the querying and analysis of ticket settlement data are therefore of great significance.

To address the performance problem of iceberg cube computation in the BI system, this thesis proposes the Dynamic Pruning based Bottom-Up Computation with Bitmap Index (DPBUC_BI) algorithm. Because a bitmap index organizes data by column, it is used to redefine the partitioning method of the Bottom-Up Computation (BUC) algorithm, which accelerates data loading and querying. Aggregation is implemented with logical bitwise operations, which further improves computational performance. Since a large share of the ticket settlement data is concentrated in a few dimensions, DPBUC_BI adopts a dynamic pruning strategy, which improves performance while keeping computational accuracy similar. Experimental results show that, on air ticket settlement data, DPBUC_BI computes iceberg cubes in less time than the traditional BUC algorithm.

To store the massive ticket settlement data and analyze it multidimensionally, a ticket settlement analysis platform is built on a distributed computing framework. After data migration with Flume and Sqoop, a ticket settlement data warehouse with a fact constellation schema is designed, and the characteristics of the ORC and Parquet storage formats are compared. To reduce the space occupied by the bitmap index, the Enhanced Word-Aligned Hybrid (EWAH) algorithm is used to compress it. Finally, a multidimensional aggregation algorithm and an association rule mining algorithm are parallelized on the MapReduce model. Experiments show that the distributed ticket settlement analysis platform can not only complete simple statistical analysis quickly but also run the parallel association rule mining algorithm well.
Keywords/Search Tags:Ticket Settlement Data, Warehouse, Iceberg Cube, Bitmap Index, Dynamic Pruning, Data mining
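
The idea of combining bottom-up iceberg cube computation with a bitmap index, as the abstract describes it, can be illustrated with a minimal Python sketch. This is not the thesis's DPBUC_BI algorithm: the dynamic pruning strategy is not specified in the abstract and is not reproduced here. Only the standard iceberg condition (a minimum support threshold on COUNT) is applied. The function names, the sample rows, and the dimension names `route`, `cabin`, and `agent` are hypothetical and chosen only for illustration. Each bitmap is stored as a Python integer, so partitioning becomes a bitwise AND and the COUNT aggregate becomes a population count.

```python
from collections import defaultdict


def build_bitmap_index(rows, dims):
    """For each dimension, map every value to a bitmap (a Python int)
    in which bit i is set when row i holds that value."""
    index = {d: defaultdict(int) for d in dims}
    for i, row in enumerate(rows):
        for d in dims:
            index[d][row[d]] |= 1 << i
    return index


def iceberg_cube(rows, dims, min_sup):
    """Return {cell: count} for every group-by cell whose COUNT >= min_sup.

    A cell is a tuple of (dimension, value) pairs; () is the apex cell.
    Cells are expanded bottom-up in BUC style, one dimension at a time.
    """
    index = build_bitmap_index(rows, dims)
    result = {}

    def expand(start, cell, bitmap):
        for k in range(start, len(dims)):
            d = dims[k]
            for value, value_bitmap in index[d].items():
                sub = bitmap & value_bitmap      # partition via bitwise AND
                count = bin(sub).count("1")      # COUNT via population count
                if count < min_sup:
                    continue                     # iceberg pruning: skip this cell and its descendants
                new_cell = cell + ((d, value),)
                result[new_cell] = count
                expand(k + 1, new_cell, sub)

    if len(rows) >= min_sup:
        result[()] = len(rows)
        expand(0, (), (1 << len(rows)) - 1)      # start with all row bits set
    return result


# Hypothetical sample data, for illustration only.
rows = [
    {"route": "PEK-SHA", "cabin": "Y", "agent": "A1"},
    {"route": "PEK-SHA", "cabin": "Y", "agent": "A2"},
    {"route": "PEK-SHA", "cabin": "C", "agent": "A1"},
    {"route": "CAN-CTU", "cabin": "Y", "agent": "A1"},
]
cube = iceberg_cube(rows, ["route", "cabin", "agent"], min_sup=2)
for cell, count in cube.items():
    print(cell, count)
```

Because COUNT can only shrink as more dimensions are added to a cell, a cell that falls below `min_sup` is skipped together with all of its descendants, which is the pruning property that BUC relies on. A compressed bitmap representation such as EWAH would replace the plain integers in a space-sensitive setting.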