Design and Implementation of a WEB Log Analysis System Based on Cloud Computing

Posted on: 2015-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: J L Xiao
GTID: 2308330473452672
Subject: Software engineering
Abstract/Summary:
With the rapid development of network and e-commerce technology, the number of applications and services running on the Internet keeps increasing, and the WEB has become the largest information system. As an important part of a WEB system, the WEB log records users' browsing activity. As the number of WEB users increases dramatically, the volume of WEB logs is growing rapidly. How to extract meaningful information quickly from massive WEB logs has become an important research topic in both industry and academia.

Many WEB log analysis systems have been designed and implemented, but most of them run on a stand-alone server, whose CPU, I/O, and storage capacity are limited. Such systems therefore fall far short of the real-time requirements of analyzing massive WEB log data.

To address this issue, this thesis designs and implements a WEB log analysis system based on cloud computing. Hadoop, a typical cloud computing framework, uses multiple machines to increase distributed computing capacity and supports distributed storage and parallel access. The system is therefore built on the Hadoop framework.

Specifically, the major work of this thesis is as follows:

First, study in depth the key technologies of Hadoop, including HDFS (the distributed file system) and MapReduce (the distributed computing framework).

Second, study how to use cloud computing technology to optimize traditional data analysis and data mining algorithms, with the aim of designing MapReduce-based parallel algorithms that improve the system's ability to process huge amounts of data. This thesis mainly implements parallel statistical and query algorithms.

Finally, design and implement a WEB log analysis system based on cloud computing, comprising log collection, log preprocessing, log storage, log statistics, and log query modules.
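As an illustration of what a MapReduce-style log statistic might look like, the following is a minimal single-process sketch in Python. It is not the thesis's implementation. The abstract names neither the log format nor the statistics computed, so the Common Log Format pattern, the per-URL page-view count, and all function names (`map_phase`, `shuffle`, `reduce_phase`, `count_page_views`) are assumptions chosen for illustration. On Hadoop, the framework itself would perform the grouping step and run the map and reduce functions in parallel across machines.

```python
import re
from collections import defaultdict
from itertools import chain

# Assumed input format: Common Log Format (the abstract does not name one).
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def map_phase(line):
    """Emit (url, 1) for a parseable log line; malformed lines emit nothing."""
    match = LOG_PATTERN.match(line)
    if match:
        yield match.group("url"), 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the per-line counts for one URL."""
    return key, sum(values)


def count_page_views(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of log lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = [
        '10.0.0.1 - - [11/Nov/2015:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 1024',
        '10.0.0.2 - - [11/Nov/2015:10:00:01 +0800] "GET /about.html HTTP/1.1" 200 512',
        '10.0.0.1 - - [11/Nov/2015:10:00:02 +0800] "GET /index.html HTTP/1.1" 304 0',
        "malformed line",
    ]
    print(count_page_views(sample))
```

Because each map call looks at one line and each reduce call at one key, both phases can be split across machines without coordination, which is the property a Hadoop-based statistics module would rely on.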
Keywords/Search Tags:Fedora system, digital resource, digital library, digital object