
Design And Implementation Of Enterprise User Data Analysis Platform Based On Big Data

Posted on: 2020-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhang
GTID: 2428330575998496
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of the Internet, more and more people have come to rely on it for their daily work and life, so almost every industry is affected by big data to some degree. Internet technology now shapes the development of many industries and has become an essential element of their operations. Big data technology helps companies manage and analyze massive amounts of fragmented data. This lets them keep up with rapidly changing trends and also predict future ones, which puts them in a more competitive position.

At present, most data analysis systems on the market perform their underlying computation with the Hive data warehouse and the Spark analysis engine, which allows the required metrics to be computed quickly even over large volumes of data. However, as data volume and query requirements grow, Spark as the underlying computing engine can no longer meet users' needs for query speed. Most business scenarios of the data analysis system described in this thesis are single-table queries. For single-table queries, ClickHouse is much faster than Spark, so the system uses ClickHouse as its underlying data calculation engine.

The enterprise user data analysis platform described in this thesis adopts the MVC design pattern and is developed in Java with Spring MVC and MyBatis. It uses Hive, Spark, and ClickHouse as the underlying data analysis engines and consists of two main modules: the data analysis module and the background management module. During development, the author independently designed and developed all sub-modules of the data analysis module, and also took part in developing the user grouping strategy, selecting the data analysis engine, and building the ClickHouse cluster.

Background Management Module: used mainly by system administrators to manage the system. It covers console management and metadata management, through which administrators manage the data sources used for data analysis and the data stored in the database, providing support for data analysis.

The system uses big data processing tools to manage fragmented user data and user analysis in a unified way. It has passed system testing and gone live, giving enterprises a fast and accurate data analysis platform for user preference analysis and precision marketing.
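The engine choice described in the abstract (ClickHouse for single-table queries, with Hive and Spark also available) can be illustrated with a small Java sketch. This is not the thesis's implementation: the class `EngineRouter`, the `Engine` enum, the table names, and the rule that multi-table queries fall back to Spark are all assumptions made here for illustration only.

```java
import java.util.List;

// Hypothetical sketch: picks a compute engine based on how many tables a query touches.
// The routing rule is an assumption; the abstract only states that single-table
// queries run much faster on ClickHouse than on Spark.
class EngineRouter {

    enum Engine { CLICKHOUSE, SPARK }

    // Returns CLICKHOUSE for a single-table query and SPARK otherwise (assumed fallback).
    static Engine chooseEngine(List<String> tables) {
        if (tables == null || tables.isEmpty()) {
            throw new IllegalArgumentException("a query must reference at least one table");
        }
        // Count distinct table names so a self-referencing query still counts as single-table.
        long distinctTables = tables.stream().distinct().count();
        return distinctTables == 1 ? Engine.CLICKHOUSE : Engine.SPARK;
    }

    public static void main(String[] args) {
        // Table names below are placeholders, not taken from the thesis.
        System.out.println(chooseEngine(List.of("user_events")));
        System.out.println(chooseEngine(List.of("user_events", "user_profile")));
    }
}
```

In a real system the decision would more likely depend on query shape, data freshness, and where the data is stored, not only on the table count.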
Keywords/Search Tags: Data analysis, Hive, Spark, ClickHouse