
Design And Implementation Of Video Logs Analysis System Based On Hadoop

Posted on: 2018-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2348330542471944
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, video sites have won over a large share of Internet users through high-quality service and rich content. Analyzing key indicators in their logs, such as regional distribution, channel distribution, and time distribution, provides strong data support for site operations, resource allocation, and advertising delivery. However, the video logs to be analyzed typically reach the TB or even PB scale, and traditional analysis approaches, constrained by CPU, memory, and other physical resources, struggle to meet this demand. Building a video log analysis system on big data technology has therefore become an urgent need. The design and implementation work in this thesis covers the following:
1. Flume and Kafka are used to collect logs. During collection, each log is annotated with its source data center, machine, collection time, and other related information, so that log collection is efficient, accurate, and stable.
2. For offline data stored on HDFS, MapReduce is used to clean, transform, and load the data into Hive tables, where it is aggregated with Hive SQL to compute log statistics. For real-time data, Spark Streaming reads messages from Kafka and analyzes the logs.
3. A Canopy-based K-means clustering algorithm is designed and implemented to analyze user behavior. The improved algorithm effectively addresses the limitations of K-means in selecting initial cluster centers and in handling outliers. It groups users with similar interests, providing a basis for subsequent user behavior analysis.
4. Part of the ECharts code is modified and extended to add a mouse-focus tooltip function. Building on ECharts' rich data presentation, the system displays data more intuitively and meets the practical needs of data analysts.
The Hadoop-based video log analysis system has been designed and implemented to meet its functional and performance requirements and has been deployed in the company. Test results show that the system can provide data services for data analysts and company managers.
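As an illustration of the Canopy-based K-means idea described in the abstract, the following is a minimal Python sketch and not the thesis's actual implementation. It assumes Euclidean distance, a loose threshold `t1` and a tight threshold `t2` (with `t1 > t2`), and a hypothetical `min_size` rule that treats a seed with too few neighbors as an outlier. The function names and the sample data are likewise assumptions made for this example. Canopy centers serve as the initial K-means centers, so the number of clusters does not have to be chosen by hand.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def canopy_seeds(points, t1, t2, min_size=2):
    """Return (centers, outliers) from a deterministic canopy pass.

    Each seed forms a canopy from all points within t1. Points within t2
    of the seed are removed from the candidate list. A seed whose canopy
    has fewer than min_size points is reported as an outlier (assumption).
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    centers, outliers = [], []
    while remaining:
        seed = remaining[0]
        members = [p for p in points if dist(seed, p) <= t1]
        if len(members) >= min_size:
            centers.append(mean(members))
        else:
            outliers.append(seed)
        remaining = [p for p in remaining if dist(seed, p) > t2]
    return centers, outliers


def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    centers = list(centers)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[idx].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Hypothetical per-user features, e.g. weekly hours on two channels.
users = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
         (8.0, 8.0), (8.2, 7.9), (7.8, 8.1),
         (20.0, 0.0)]

seeds, outliers = canopy_seeds(users, t1=3.0, t2=1.5)
kept = [u for u in users if u not in outliers]
final_centers, clusters = kmeans(kept, seeds)
print(len(final_centers), "clusters;", len(outliers), "outlier(s) set aside")
```

In this sketch the isolated point is set aside before K-means runs, so it cannot pull a cluster center away from the dense groups. The thresholds strongly affect the result and would need tuning on real log features.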
Keywords/Search Tags: Data Analysis, Big Data, Log Collection, K-means, Hadoop