
Design And Implementation Of The Mass Data Analysis System Based On Hadoop

Posted on: 2014-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2268330425477628
Subject: Software engineering
Abstract/Summary:
With the rapid development of E-commerce, E-commerce companies are attempting to analyze the log files generated when users log in and browse, in order to characterize their users. The findings help them arrange the order of pages and plan advertising more efficiently.

When logs reach GB or even TB in size, traditional stand-alone databases can no longer keep pace with the growth of the data. Distributed databases, meanwhile, have become increasingly mature at processing information in parallel, which makes them markedly more efficient for Big Data workloads. Hadoop, developed by the Apache Software Foundation, is one of the most prominent platforms of this kind and processes data at TB or even PB scale at satisfactory speed.

Targeting E-commerce companies' need to analyze enormous volumes of user access logs, this paper designs a Mass Data Analysis System based on the Hadoop platform and related technologies. Running on a server, the system can analyze logs of hundreds of GB or even TB in size. It supports the analyses of user data that concern E-commerce companies, such as the sources of users and page flow. Finally, the system automatically presents the results as charts, which makes them easier to apply to business decisions.

This paper first describes the project background and the Hadoop-related technologies. Based on the demand analysis, it then proposes a specific solution and the applied technology for each functional part of the system. Finally, the overall implementation and testing of the system are presented.
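
As an illustration of the user-source analysis mentioned above, the following is a minimal sketch of the map and reduce steps in plain Python. It is not the thesis's implementation: the tab-separated log format, the sample lines, and the function names `map_source`, `reduce_counts`, and `count_sources` are all assumptions made for this example, and a real Hadoop job would distribute these steps across a cluster rather than run them in one process.

```python
from itertools import groupby
from urllib.parse import urlparse

# Hypothetical log format (an assumption, not taken from the thesis):
# user_id <TAB> referrer_url <TAB> requested_page
SAMPLE_LOG = [
    "u1\thttp://search.example.com/q?x=1\t/home",
    "u2\thttp://ads.example.net/banner\t/item/42",
    "u3\thttp://search.example.com/q?x=2\t/home",
    "malformed line",
]


def map_source(line):
    """Map step: emit (referrer_host, 1) for each well-formed log line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return  # skip lines that do not match the assumed format
    host = urlparse(fields[1]).netloc or "direct"
    yield host, 1


def reduce_counts(pairs):
    """Shuffle and reduce steps: sort by key, group, and sum the counts."""
    totals = {}
    for host, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        totals[host] = sum(count for _, count in group)
    return totals


def count_sources(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = [kv for line in lines for kv in map_source(line)]
    return reduce_counts(pairs)


if __name__ == "__main__":
    for host, total in count_sources(SAMPLE_LOG).items():
        print(host, total)
```

The same two functions could, under the same assumptions, be adapted to read from standard input and used as mapper and reducer scripts, since the sort-then-group pattern mirrors how a MapReduce framework delivers keys to a reducer.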
Keywords/Search Tags: Hadoop, Mass Data, Page Flow