The Design And Implementation Of Log Analysis System In Cloud Computing Environment

Posted on: 2014-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: P Wang
Full Text: PDF
GTID: 2298330467963038
Subject: Computer Science and Technology
Abstract/Summary:
In the information era, information systems have become part of many aspects of daily life, and networks are ever more closely tied to everyday activity. Every operation in such a system writes a record to its log. These log records capture the operation and state of the system, and this information is important both for keeping the system running normally and for improving its quality.

With the development of the Internet and the mobile Internet, the scale of data is growing explosively, and storing and analyzing data at this scale has become a challenge. Cloud computing offers an effective way to store and process massive data by pooling the storage and computing capacity of a cluster. Hadoop follows this approach, handling workloads that are beyond the traditional stand-alone mode, which is limited by the memory and computing power of a single machine. For a long time, however, Map/Reduce was the only computing framework available in Hadoop, and it does not suit every application's requirements. In addition, because there was no unified resource manager, different frameworks had to be deployed on separate clusters. YARN addresses this problem: it acts as the resource manager of the cluster and allows a variety of computing frameworks to share a single cluster.

This thesis proposes a log analysis system for the cloud computing environment, built on Hadoop, Spark and YARN, that can process large-scale data and offers several kinds of computing capability. The main work is as follows.

Firstly, the distributed systems Hadoop, Spark and YARN are studied to establish their basic principles and the application scenarios each is suited to. Building on an in-depth study of YARN, a computing platform is built that supports multiple frameworks and provides a variety of computing capabilities.

Secondly, on top of this cloud platform, a log analysis system with large-scale data storage and log cleaning capabilities is designed and then implemented using a combination of Map/Reduce and Spark.

Finally, the capability and performance of the cloud platform and the log analysis system are tested. The results show that the unified, multi-framework cloud platform provides a variety of computing capabilities and dramatically improves performance, while enabling data sharing and efficient use of resources. They also show that the design of the log analysis system is feasible, effective and of practical significance.
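As an illustration of the log cleaning and aggregation step mentioned in the abstract, the following is a minimal single-machine sketch in plain Python that mimics a map phase and a reduce phase. The log line format, the field names, and the functions `clean_line`, `map_phase` and `reduce_phase` are assumptions made for illustration only; the abstract does not describe the actual log format or code, and the real system runs on Hadoop Map/Reduce and Spark rather than in a single process.

```python
import re
from collections import defaultdict

# Hypothetical log format (an assumption, not taken from the thesis):
#   "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (INFO|WARN|ERROR) (.+)$"
)


def clean_line(raw):
    """Parse one raw log line; return a record dict, or None if malformed."""
    match = LINE_RE.match(raw.strip())
    if match is None:
        return None
    date, time, level, message = match.groups()
    return {"date": date, "time": time, "level": level, "message": message}


def map_phase(lines):
    """Cleaning + map: drop malformed lines, emit (level, 1) for valid ones."""
    for raw in lines:
        record = clean_line(raw)
        if record is not None:
            yield record["level"], 1


def reduce_phase(pairs):
    """Reduce: sum the emitted counts per key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


# Made-up sample input, including two lines that cleaning should discard.
sample = [
    "2014-05-09 10:00:01 INFO user login",
    "garbage line",
    "2014-05-09 10:00:02 ERROR disk full",
    "2014-05-09 10:00:03 INFO user logout",
    "",
]

result = reduce_phase(map_phase(sample))
print(result)
```

In a cluster setting the same two functions would correspond roughly to a mapper and a reducer, with the framework handling the grouping of keys between them; that mapping is a general property of the Map/Reduce model, not a description of the thesis's code.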
Keywords/Search Tags: log analysis, distributed computing, Hadoop, Spark, YARN