Research And Application Of Map/Reduce Based Distributed Log Analyzer

Posted on: 2012-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2218330368497106
Subject: Computer software and theory
Abstract/Summary:
This thesis implements a MapReduce-based framework for analyzing the distributed logs generated in cloud computing environments. The framework is built on top of Hadoop, an open-source distributed file system and MapReduce implementation. First, we use random-access file reads to incrementally aggregate system logs from each node of the monitored cluster and collect them on the analysis cluster. Next, we integrate the collected logs and implement a MapReduce-based algorithm to parse the resulting log files. Finally, to make the best use of the collected data, we provide a flexible and powerful way to display the monitoring and analysis results.

In addition, we quantitatively evaluate and characterize the Hadoop framework through I/O-intensive benchmarking, in order to optimize performance and understand the design tradeoffs of MapReduce-based data analysis on Hadoop. We first characterize and evaluate the workload performance of I/O-intensive benchmarks under different underlying software choices, covering both I/O schedulers and native filesystems. We then propose potential enhancements to improve the performance of the Hadoop benchmarks, and conclude with a summary of our experiments.
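The incremental collection and MapReduce-style parsing steps described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses Python file seeking in place of a Java random-access file, and the function names (`read_new_lines`, `map_line`, `reduce_counts`) and the `DATE TIME LEVEL message` log layout are assumptions made for illustration only.

```python
from collections import defaultdict


def read_new_lines(path, offset):
    """Return (complete lines appended since `offset`, new offset).

    Seeking to a saved byte offset means each poll reads only the
    part of the log that was written after the previous poll.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline, so a half-written line
    # is left in place and picked up whole on the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def map_line(line):
    """Map step: emit (level, 1) for a hypothetical
    'DATE TIME LEVEL message' log layout."""
    parts = line.split(None, 3)
    if len(parts) >= 3:
        yield parts[2], 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)
```

A collector in this style would persist the returned offset per log file between polls, then feed the gathered lines through `map_line` and `reduce_counts`. In a real Hadoop job the map and reduce steps would run as distributed tasks over the integrated log files rather than in one process.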
Keywords/Search Tags: Distributed system, MapReduce, Hadoop, Performance Optimization, I/O Scheduler, Filesystem