
Design And Implementation Of Multidimensional Data Analysis System For Media Big Data

Posted on: 2019-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: J W Chen
Full Text: PDF
GTID: 2428330563458477
Subject: Software engineering
Abstract/Summary:
With the advent of the big data era, news clients generate large volumes of user behavior data and business data every day. How to process this data effectively and extract its meaning has become a concern for enterprises. Growing business data puts traditional data analysis systems under heavy pressure, which shows up as insufficient read and write performance and poor query efficiency. To solve these problems, this thesis designs and implements a multidimensional data analysis system for news media big data based on the Hadoop platform. The design follows an analysis of the current state of OLAP systems at home and abroad and a thorough investigation of existing multidimensional data systems.

The system architecture is built on the Hadoop platform; the front end adopts the Bootstrap framework and the back end adopts Hive and NoSQL. Based on the characteristics of the Parquet storage format in Hive, this thesis designs a star schema, stored in the Hive database, that links the fact table with the dimension tables to improve the scalability of the multidimensional data model. At the same time, data is stored in Impala and Kylin according to its data level in Hive to improve query efficiency. The HQL query statements provided by Hive, which are based on the MapReduce computing framework, improve the efficiency of data extraction, transformation and loading. The thesis then completes the design and implementation of the whole system, including data management and multidimensional data analysis, and closes with a summary and outlook for this research.

This thesis aims to analyze the user data of news clients multidimensionally, exploring the data from as many dimensions as possible. On the one hand, this lets business personnel understand the status of users in a timely manner so as to provide better service, and it provides accurate and comprehensive data support for the long-term development and decision making of the product. On the other hand, it offers considerable convenience to users without an SQL background, who can carry out complex data queries simply by dragging controls.
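As a rough illustration of the ideas in the abstract, the sketch below shows how a selection of dimensions (such as a user might make by dragging controls) could be turned into a grouped query over a star schema. It is not the thesis's implementation: it uses Python's built-in sqlite3 module in place of Hive, Impala or Kylin, and the table names, column names, sample rows and the `build_query` and `run_demo` helpers are all hypothetical.

```python
import sqlite3

# Hypothetical star schema: one fact table linked to two dimension tables.
SCHEMA = """
CREATE TABLE dim_channel (channel_id INTEGER PRIMARY KEY, channel_name TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE fact_clicks (channel_id INTEGER, date_id INTEGER, clicks INTEGER);
"""

# Whitelist mapping a user-visible dimension name to (table, join key, column).
# Only names listed here can reach the SQL text.
DIMENSIONS = {
    "channel": ("dim_channel", "channel_id", "channel_name"),
    "day": ("dim_date", "date_id", "day"),
}


def build_query(selected_dims):
    """Turn a list of selected dimension names into a GROUP BY query."""
    joins, columns = [], []
    for name in selected_dims:
        table, key, column = DIMENSIONS[name]  # raises KeyError on unknown names
        joins.append(f"JOIN {table} ON fact_clicks.{key} = {table}.{key}")
        columns.append(f"{table}.{column}")
    select_list = ", ".join(columns + ["SUM(fact_clicks.clicks) AS total_clicks"])
    sql = f"SELECT {select_list} FROM fact_clicks {' '.join(joins)}"
    if columns:
        grouped = ", ".join(columns)
        sql += f" GROUP BY {grouped} ORDER BY {grouped}"
    return sql


def run_demo(selected_dims):
    """Load made-up sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_channel VALUES (?, ?)",
                     [(1, "sports"), (2, "tech")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(1, "2019-01-01"), (2, "2019-01-02")])
    conn.executemany("INSERT INTO fact_clicks VALUES (?, ?, ?)",
                     [(1, 1, 10), (1, 2, 5), (2, 1, 7), (2, 2, 3)])
    rows = conn.execute(build_query(selected_dims)).fetchall()
    conn.close()
    return rows
```

Calling `run_demo(["channel"])` aggregates clicks per channel, while `run_demo(["channel", "day"])` drills down to one row per channel and day. Adding a new dimension in this sketch only requires a new dimension table and one entry in `DIMENSIONS`, which is the kind of scalability a star schema is meant to offer.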
Keywords/Search Tags: Multidimensional data analysis, OLAP, Data warehouse, Hadoop