Design And Development Of Large Data Processing System For Electronic Business Flow

Posted on: 2017-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Wang
GTID: 2348330512464995
Subject: Control engineering
Abstract/Summary:
Advances in computer technology and the growing popularity of the Internet have driven the rapid development of e-commerce and the continuous emergence of new enterprises, which in turn has led to a rapid increase in the multi-source flow (traffic) data produced by e-commerce websites. Traditional methods for processing e-commerce flow data are limited in storage capacity and support only a small number of data sources, so a modern flow data processing system is urgently needed to meet enterprise data processing requirements. In this thesis, an e-commerce flow data processing system is designed and developed based on Hadoop and Elasticsearch. The system provides rapid extraction, transformation, loading and management of massive volumes of original e-commerce data. On top of these functions, flow indicators of an e-commerce website can be obtained, including page hits, number of unique visitors, page conversion rate, page bounce rate and exit rate.

The main work of this thesis is as follows. Firstly, on the basis of a literature review, the research background and significance are summarized, together with the current state of flow data processing in China and abroad. The technologies used are also introduced: the Lucene-based search engine Elasticsearch, distributed storage technology, ETL technology, the distributed file system HDFS, the MapReduce programming model, the open-source framework Spring MVC, and Spring and MyBatis.

Secondly, the market demand, the social and economic benefits and the functional requirements are analyzed, along with the detailed requirements of the original data ETL module and of the flow statistics service module built on the original data. Based on this analysis, the four functional modules of the system are designed, namely data import, data ETL, data management and statistical parameters, and the functions of each module are described in detail.

Thirdly, the main body of the original data ETL system stores data in the distributed file system HDFS on a Hadoop cluster and processes the flow data with the MapReduce programming model. It supports several types of original data source, including relational databases such as Oracle, MySQL and PostgreSQL, as well as data interfaces.

Fourthly, the flow statistics service system uses an Elasticsearch cluster as its data source, and the cluster can be expanded as needed. The system stores its conventional data in an Oracle database, and its back end is built with the SSM (Spring, Spring MVC, MyBatis) framework. Bootstrap, jQuery, Ajax and other technologies are used for front-end development.

Finally, the original data ETL system implements extraction, processing and management of the original flow data, as well as import of the processed flow data into the Elasticsearch cluster. The flow statistics service system implements statistics for the various flow parameters, personnel management, flow alarms, scheduled email delivery and other routine functions, based on the flow data in the Elasticsearch cluster and the conventional data in the Oracle database. At present, the system is in operation at a large, well-known e-commerce enterprise, where it runs stably and responds well.
Keywords/Search Tags:Hadoop, Elasticsearch, data processing, electronic business flow
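The abstract names the flow indicators but does not define them. As an illustration only, the following minimal Python sketch computes page hits, unique visitors, bounce rate and per-page exit rate from an in-memory list of page-view records. The record layout `(visitor_id, session_id, page)`, the function name `flow_metrics` and the metric definitions (bounce rate as single-page sessions over all sessions, exit rate as sessions ending on a page over hits on that page) are assumptions made for this sketch, not the thesis's implementation, which runs on MapReduce and Elasticsearch.

```python
from collections import defaultdict


def flow_metrics(page_views):
    """Compute basic flow indicators from page-view records.

    page_views: iterable of (visitor_id, session_id, page) tuples,
    assumed to be in chronological order within each session.
    The record layout is hypothetical and chosen for illustration.
    """
    sessions = defaultdict(list)  # session_id -> ordered list of pages
    visitors = set()              # distinct visitor ids
    hits = defaultdict(int)       # page -> number of views

    for visitor_id, session_id, page in page_views:
        sessions[session_id].append(page)
        visitors.add(visitor_id)
        hits[page] += 1

    total_sessions = len(sessions)

    # A bounce is assumed to be a session with exactly one page view.
    bounces = sum(1 for pages in sessions.values() if len(pages) == 1)

    # An exit is attributed to the last page viewed in each session.
    exits = defaultdict(int)
    for pages in sessions.values():
        exits[pages[-1]] += 1

    return {
        "page_views": sum(hits.values()),
        "unique_visitors": len(visitors),
        "bounce_rate": bounces / total_sessions if total_sessions else 0.0,
        "exit_rate": {page: exits[page] / hits[page] for page in hits},
    }


if __name__ == "__main__":
    sample = [
        ("u1", "s1", "/home"),
        ("u1", "s1", "/cart"),
        ("u2", "s2", "/home"),
        ("u1", "s3", "/home"),
        ("u1", "s3", "/cart"),
        ("u1", "s3", "/pay"),
    ]
    print(flow_metrics(sample))
```

In a distributed setting the same grouping by session and by page would map naturally onto a MapReduce job or an aggregation query, but that mapping is not described in the abstract and is left out here.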