The Decision Tree Algorithm In The Application Of The Web Server Log Analysis

Posted on: 2012-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: X X Jin
Full Text: PDF
GTID: 2248330371465233
Subject: Software engineering
Abstract/Summary:
With the development of Internet technology, web browsing has become one of the most important means of information search, knowledge acquisition, and recreation. Web sites have grown in scale in order to provide new services, which has increased the difficulty and complexity of maintenance. How to leverage the large amount of data collected during maintenance to provide new functions or identify potential risks for maintenance personnel defines a new direction for web site maintenance.

Data mining is a technology for extracting predictive or descriptive information from large amounts of data or from a data warehouse. It can identify potential patterns, extract valuable information, guide commercial behavior, or assist scientific research. Applying data mining to web site maintenance makes it possible to analyze features of the system operation log, such as alarm data, which can help to identify potential problems in the web site, improve maintenance efficiency, reduce maintenance cost, and keep the web site robust.

This thesis first introduces the UNIX-based web site architecture, together with the state of the art and open issues in web site maintenance. Based on an analysis of the system operation log, the decision tree algorithm is applied to diagnose potential problems of web sites. We used the C4.5 decision tree algorithm and the WEKA tool for data mining and decision tree modeling. The rules in the resulting model are described in sequence to establish a reliable monitoring and diagnosis mechanism. We also designed a prototype comprising a data mining module and a log alert module, whose components and functions are described in detail.
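To illustrate the kind of split selection that C4.5 performs, the following is a minimal Python sketch of the gain ratio criterion on categorical attributes. It is not the thesis's implementation, which relied on WEKA (whose C4.5 implementation is the J48 classifier). The attribute names `cpu_alarm`, `disk_alarm`, and `status`, and the six sample rows, are hypothetical and chosen only for illustration. The sketch also omits the other parts of C4.5, such as continuous attributes, missing values, and pruning.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` on `attr` with respect to `target`."""
    labels = [r[target] for r in rows]
    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attr], []).append(r[target])
    total = len(rows)
    # Weighted entropy of the target after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([r[attr] for r in rows])
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, attrs, target):
    """Attribute with the highest gain ratio; chosen as the next tree node."""
    return max(attrs, key=lambda a: gain_ratio(rows, a, target))


# Hypothetical records derived from a system operation log (illustrative only).
SAMPLE_ROWS = [
    {"cpu_alarm": "yes", "disk_alarm": "no", "status": "fault"},
    {"cpu_alarm": "yes", "disk_alarm": "yes", "status": "fault"},
    {"cpu_alarm": "no", "disk_alarm": "yes", "status": "ok"},
    {"cpu_alarm": "no", "disk_alarm": "no", "status": "ok"},
    {"cpu_alarm": "yes", "disk_alarm": "no", "status": "fault"},
    {"cpu_alarm": "no", "disk_alarm": "yes", "status": "ok"},
]

if __name__ == "__main__":
    for name in ("cpu_alarm", "disk_alarm"):
        print(name, round(gain_ratio(SAMPLE_ROWS, name, "status"), 3))
    print("split on:", best_attribute(SAMPLE_ROWS, ["cpu_alarm", "disk_alarm"], "status"))
```

In this toy data, `cpu_alarm` separates the two classes completely, so it would be selected as the root test. A full decision tree would repeat the same selection on each resulting subset, and each root-to-leaf path could then be read as one diagnostic rule.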
Keywords/Search Tags: Website Maintenance, Data Mining, Decision Tree, C4.5, WEKA