Design And Implementation Of User Data Analysis Platform For Video Website

Posted on: 2023-09-23
Degree: Master
Type: Thesis
Country: China
Candidate: P T Jin
Full Text: PDF
GTID: 2558306848955569
Subject: Software engineering
Abstract/Summary:
With the continued growth of Internet users and network technology, the video industry has expanded rapidly and now generates very large volumes of data. This data is both diverse and massive. To extract value from such a large and complex data lake, the data must be stored, analyzed, and processed by a range of technical means. Many video websites face large data volumes, complex business logic, and diverse data, and in practice struggle with analysis, management, processing, and operation. To query massive data quickly and improve the quality of data services, it is particularly important to build a user data platform for the video website. This thesis designs and implements such a platform. The main contributions are as follows:

(1) The thesis analyzes the data query and data development functions of the current video website user data platform, identifies its shortcomings, such as high query latency and slow execution, and surveys the related technologies in the video website's big data system: stream data processing, Spark RDD (Resilient Distributed Datasets) for scalable distributed data analysis, the HDFS distributed file storage system, the Kafka distributed message middleware, and others. After surveying and comparing them, the thesis takes a first step toward solving the high query latency through multi-task parallel processing.

(2) The platform comprises several functional modules; the core modules are quick data query, data development, intelligent operation, and user churn behavior analysis. The system architecture is designed around Hadoop and Spark, and query speed is improved through multi-task parallel processing.

(3) Periodic Spark batch tasks are implemented in the data task development module using Hadoop and Spark; real-time data processing in the user churn behavior analysis module is implemented with Spark Streaming and Kafka message middleware; efficient data query in the data query module is implemented by intelligently selecting between Hive and Spark SQL; the morning report push function is implemented in the intelligent operation module; and a local data cache algorithm is designed.

(4) Finally, the system's functions are tested and evaluated. The tests show that the system achieves the expected results. The platform not only extends the functions of the existing system of the enterprise's data department but also improves the data support provided to other departments of the enterprise.
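The abstract does not say how the multi-task parallel processing is implemented. As a rough illustration of the general idea only, the sketch below runs several independent query tasks concurrently instead of one after another. It is a minimal Python sketch under stated assumptions: `run_query`, `run_parallel`, and the task names are hypothetical and are not taken from the thesis, and the simulated delay stands in for a real Hive or Spark SQL call.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_query(name, delay=0.05):
    """Hypothetical stand-in for one independent query task."""
    time.sleep(delay)  # simulate I/O-bound query latency
    return f"{name}: done"


def run_parallel(task_names, max_workers=4):
    """Run independent query tasks concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of task_names,
        # regardless of which task finishes first.
        return list(pool.map(run_query, task_names))


if __name__ == "__main__":
    # Hypothetical task names, for illustration only.
    tasks = ["daily_active_users", "watch_time", "churn_candidates"]
    print(run_parallel(tasks))
```

Because the tasks are independent and mostly wait on I/O, the total wall-clock time approaches that of the slowest task rather than the sum of all tasks, which is the kind of latency reduction the abstract attributes to parallel processing.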
Keywords/Search Tags:Video big data, Local cache algorithm, Distributed, Spark