Design And Implementation Of Behavior Data Collection And Statistics System Of Web User

Posted on: 2016-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: R Yang
Full Text: PDF
GTID: 2308330470455835
Subject: Software engineering
Abstract/Summary:
With the advent of the Internet era, the network has become part of everyday life, and people have gradually accepted online shopping as a way to consume. As the number of online consumers has grown dramatically, e-commerce websites have begun to invest more to attract users and increase revenue. For an e-commerce website, good site design and a satisfying shopping experience are crucial to its operation, so website analysis becomes necessary. To understand how users access the website, operators of e-commerce websites have to collect comprehensive and detailed data on users' browsing behavior. In the context of big data, large volumes of information make website analysis more insightful and may reveal potential value in otherwise inconspicuous data.

Although many third-party website analysis tools exist, some of them free, applying them to a site in practice is not convenient. For example, Google Analytics gathers data with the JavaScript Page Label method, which requires modifying web pages to import the JavaScript code. In addition, capturing each kind of user behavior still requires modifying many pages to add event tracking code. As a result, the workload of data capture is heavy and the tracking code is inconvenient to manage. Web logs, the other common method for collecting behavior data, cannot track events, and the logged data must be filtered before use. The focus of this paper is to implement a behavior data collection and statistics system that gathers data with the JavaScript Page Label method but does not require the page content to be modified manually. It uses an Nginx module to embed different JavaScript into different web pages automatically, which makes the management of event tracking JavaScript unified and convenient. The data collection server is based on Netty and can handle a large amount of data quickly.
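The automatic embedding step can be illustrated with a minimal sketch. The abstract states only that an Nginx module embeds different JavaScript into different pages; the Java class below is a hypothetical stand-in for that idea, not the author's module. The class name `TrackingInjector`, the URL patterns, and the script paths are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: choose a tracking script by request path and insert it
// before </body>. The thesis does this inside an Nginx module; this class only
// illustrates the idea of rule-based injection without editing pages by hand.
final class TrackingInjector {
    // One rule: a URL pattern and the script to embed for matching pages.
    private static final class Rule {
        final Pattern urlPattern;
        final String scriptUrl;

        Rule(Pattern urlPattern, String scriptUrl) {
            this.urlPattern = urlPattern;
            this.scriptUrl = scriptUrl;
        }
    }

    // Rules are checked in the order they were added; the first match wins.
    private final List<Rule> rules = new ArrayList<>();

    void addRule(String urlRegex, String scriptUrl) {
        rules.add(new Rule(Pattern.compile(urlRegex), scriptUrl));
    }

    // Returns the HTML with a <script> tag inserted before the last </body>.
    // The HTML is returned unchanged when no rule matches the path or when
    // no lowercase </body> tag is present.
    String inject(String requestPath, String html) {
        for (Rule rule : rules) {
            if (rule.urlPattern.matcher(requestPath).matches()) {
                int idx = html.lastIndexOf("</body>");
                if (idx < 0) {
                    return html;
                }
                String tag = "<script src=\"" + rule.scriptUrl + "\"></script>";
                return html.substring(0, idx) + tag + html.substring(idx);
            }
        }
        return html;
    }
}
```

Because the rules live in one place, changing which pages carry which tracking script means editing a rule rather than editing every page, which is the management benefit the abstract describes.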
The data collection server sends the behavior data to the MetaQ message middleware. Statistics are computed in two ways: Hive generates customized reports, and Storm implements real-time statistics. Both pull messages independently from MetaQ, which decouples the data collection server from the statistics.

In this project, the author's main work includes researching methods of capturing user behavior data and implementing the data capture, data collection, and storage modules. The author's part of the statistics work is implemented with Hive, so the Storm-based real-time statistics are not described in this paper. The system currently provides data statistics services for China Unicom Online Mall and Mobile Mall. With the help of an existing task scheduling system, reports can be generated and sent to the relevant people daily or periodically. At present, data storage in HDFS is basically real-time, so the status of the website can be monitored through real-time queries of the behavior data. If an abnormal situation occurs on the website, the system sends short warning messages to developers.
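The decoupling described above, in which two statistics paths pull the same messages independently, can be sketched as follows. This is a minimal in-memory illustration and does not use the MetaQ API; the class `EventLog`, its methods, and the consumer names in the usage note are hypothetical. The point it shows is that each consumer keeps its own read position, so the producer never needs to know who reads the data or how fast.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a message middleware topic.
// The producer only appends; each named consumer pulls from its own offset.
final class EventLog {
    private final List<String> messages = new ArrayList<>();
    // Next unread position for each consumer, keyed by consumer name.
    private final Map<String, Integer> offsets = new HashMap<>();

    // Called by the collection server; it knows nothing about consumers.
    synchronized void publish(String message) {
        messages.add(message);
    }

    // Returns up to max unread messages for this consumer and advances
    // only this consumer's offset. Other consumers are unaffected.
    synchronized List<String> pull(String consumer, int max) {
        int from = offsets.getOrDefault(consumer, 0);
        int to = Math.min(messages.size(), from + max);
        List<String> batch = new ArrayList<>(messages.subList(from, to));
        offsets.put(consumer, to);
        return batch;
    }
}
```

In this sketch, a batch consumer (say `"hive-batch"`) and a real-time consumer (say `"storm-realtime"`) would each call `pull` with their own name and receive every published message once, regardless of how far the other has read.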
Keywords/Search Tags: Web Analytics, Behavior Data Collection, JavaScript Embedded Automatically, Netty