Log Analysis and Processing System Based on Spark

Posted on: 2017-06-21  Degree: Master  Type: Thesis
Country: China  Candidate: J Dong  Full Text: PDF
GTID: 2348330536976734  Subject: Computer software and theory
Abstract/Summary:
A Web log is the collection of all log information generated as users browse web pages. Web log analysis is the process of converting, cleaning, and mining that data. It reveals users' access behavior and patterns, so that the structure of a site can be optimized accordingly to give users a better experience. Traditional log analysis, however, is mostly based on serial processing and is inadequate in the face of massive data volumes. Spark is one of the most active, popular, and efficient general-purpose big data computing platforms, and it is a powerful tool for analyzing and processing large-scale log data. This thesis focuses on the research and development of log data analysis and processing based on parallel computing. It studies a method for parallel processing of big data on the Spark platform, a new log data processing architecture based on Spark, and an algorithm for analyzing large-scale log files implemented on Spark in the Scala language. Building on this theoretical and technical research, a big data processing platform based on Spark is built, and a Web log data set collected by a search engine is stored in the HDFS distributed file system. A large-scale log file analysis system is then implemented in Scala. The system consists mainly of log collection, log storage, log analysis, and data display modules. The open-source log collection system Flume loads the data to be analyzed into HDFS for storage; Spark's RDD-based in-memory computing then analyzes the log data in parallel, greatly improving the execution speed of log analysis.
Keywords/Search Tags:Web Log Analysis, Big Data, Spark, Scala