The Design And Implementation Of Web Log Collecting And Data Analysis

Posted on: 2016-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: T S Zhang
Full Text: PDF
GTID: 2308330476452772
Subject: Software engineering
Abstract/Summary:
As big data has grown in prominence, more and more companies have taken an interest in it, hoping to find patterns in user data that can support decisions by senior management. Web telemetry has consequently become increasingly popular and helps many companies improve product quality.

This thesis describes the collection and analysis of web telemetry data: how to collect data and track user behavior on web pages through a JavaScript API, how to process and analyze that data, and how the results provide useful suggestions for enterprise decision making. It describes the key technologies of data collection and data analysis in the system, analyzes the system requirements, and describes the system design. Specifically, it covers:
① the goals of the telemetry system and the design of the Telemetry API;
② because the system also runs on mobile devices and on web pages that do not refresh frequently, data is posted through a hidden iframe instead of a hidden GIF, to save network traffic and improve system performance;
③ because the log volume is very large, multiple log servers with load balancing handle the logs and save them to Microsoft Azure; the file path consists of the machine name and a time tick, and a new file is generated every 5 minutes;
④ a distributed system converts the text logs into structured logs, and a script runs daily to generate regular reports such as daily users and sessions, which are then displayed with SQL Server Reporting Services;
⑤ a distributed system aggregates the data and loads it into a database, where data mining is performed.

After development, the system passed testing and runs well. Reports based on the telemetry data have already been produced and have provided useful suggestions for enterprise decision making, which shows that the system is workable and valid.

In contrast to other such systems, this system has the following characteristics:
1. Raw log data is downloadable, so customized reports and data mining can all be based on the raw logs.
2. The system uses a hidden iframe to post data. The advantage of a hidden iframe is that there is no size limit on the posted data, so data does not have to be submitted frequently, which saves network traffic for websites that use Ajax for interaction.
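The log file layout in item ③ (a path built from the machine name and a time value, with a new file every 5 minutes) can be illustrated with a minimal sketch. The abstract does not give the actual path format, so the function name `log_file_path`, the `machine/window.log` layout, and the use of Unix epoch seconds as the time value are all assumptions made for illustration only.

```python
from datetime import datetime, timezone

# The abstract states that a new file is generated every 5 minutes.
ROTATION_SECONDS = 5 * 60


def log_file_path(machine_name: str, when: datetime) -> str:
    """Return a hypothetical log path for the 5-minute window containing `when`.

    Assumption: the time component is the Unix timestamp (in seconds) of the
    start of the window; the thesis may use a different tick format.
    """
    epoch_seconds = int(when.timestamp())
    # Round down to the start of the current 5-minute window.
    window_start = epoch_seconds - (epoch_seconds % ROTATION_SECONDS)
    return f"{machine_name}/{window_start}.log"


# Two events in the same 5-minute window map to the same file;
# an event in the next window maps to a new file.
a = log_file_path("logsrv01", datetime(2015, 3, 1, 12, 3, 10, tzinfo=timezone.utc))
b = log_file_path("logsrv01", datetime(2015, 3, 1, 12, 4, 59, tzinfo=timezone.utc))
c = log_file_path("logsrv01", datetime(2015, 3, 1, 12, 5, 0, tzinfo=timezone.utc))
```

Including the machine name in the path means that log servers behind the load balancer never write to the same file, and rounding the time down to a fixed window keeps every server's rotation boundaries aligned.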
Keywords/Search Tags:Telemetry, data analysis, data mining