Design and Implementation of an Offline Data Analysis Platform Based on Hadoop

Posted on: 2019-04-17    Degree: Master    Type: Thesis
Country: China    Candidate: X Zhu    Full Text: PDF
GTID: 2428330596455357    Subject: Engineering
Abstract/Summary:
In recent years, with the rapid development of e-commerce, the volume of log data generated by users has grown day by day. The raw data, however, contains a great deal of noise, inconsistency, and outright garbage, so it must be cleaned, filtered, and parsed before valuable information can be extracted from it. To address these problems, this thesis designs and implements an offline data analysis platform based on Hadoop, consisting of a data collection module, a data analysis module, and a data display module. The data collection module gathers user data mainly through JavaScript files that are triggered by user-defined behaviors. The data analysis module mainly uses the MapReduce programming model and Hive scripts from the Hadoop ecosystem to analyze and process the data, computing results along eight user-defined analysis dimensions; this is the core function of the platform. The data display module presents the parsed data on the platform's pages using the open source HighCharts charting library, which makes the analysis results easier to explore. With the platform built in this thesis, an e-commerce website can answer questions such as: whether users are churning, the gender and age distribution of members, users' preferences for browsers and mobile operating systems, the regional distribution of members' orders, whether a poor user experience is reducing how deeply users browse the site, the share of sales that comes from festival or marketing campaigns, and how order volumes compare. Exploring these questions reveals the current state of an e-commerce website, helps website staff allocate resources sensibly, and helps maximize returns, thereby creating new business value.
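To make the analysis step concrete, the following is a minimal sketch of how one dimension (browser preference) could be computed in the MapReduce style. It is not the thesis's code: the log format (URL-encoded `key=value` pairs), the field name `browser`, and the functions `map_record`, `shuffle`, `reduce_counts`, and `run_job` are all assumptions made for illustration, and the job runs in a single Python process rather than on a Hadoop cluster.

```python
from collections import defaultdict
from urllib.parse import parse_qs


def map_record(line):
    """Map step: parse one log line and emit (browser, 1).

    Lines with no usable 'browser' field are treated as dirty data and dropped,
    which stands in for the cleaning and filtering described in the abstract.
    The 'key=value&key=value' log format is a hypothetical one.
    """
    fields = parse_qs(line.strip())
    browser = fields.get("browser", [""])[0].strip()
    if browser:
        yield browser.lower(), 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one browser."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in-process over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_record(line))
    return dict(reduce_counts(k, vs) for k, vs in shuffle(pairs).items())


# Hypothetical sample records, including two that should be filtered out.
SAMPLE_LOG = [
    "event=pageview&browser=Chrome&os=Android",
    "event=pageview&browser=Firefox&os=Windows",
    "garbage line without fields",
    "event=order&browser=chrome&os=iOS",
    "event=pageview&browser=&os=iOS",
]

if __name__ == "__main__":
    print(run_job(SAMPLE_LOG))  # {'chrome': 2, 'firefox': 1}
```

On a real cluster the map and reduce functions would be separate tasks and the grouping would be done by the framework; the other analysis dimensions would follow the same pattern with a different key.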
Keywords/Search Tags:Data analysis, Hadoop, Log parsing, MapReduce, Hive