| With the rapid development of the domestic IPTV business, IPTV terminals generate and accumulate data on an unprecedented scale. To better understand users' demands and provide a personalized viewing experience, it is necessary to collect user data and analyze it with big-data technologies to derive more valuable information. Data collection is the first step in this process, providing the raw material for further big-data analysis. The traditional manual method of collecting radio and television ratings cannot keep up with the growth of IPTV data, so systems and methods for automatic, computer-based data collection have emerged. Automatic collection is generally built on a centralized system of physical or virtual machines, which has high costs and low utilization of fragmented computing resources. To reduce cost and make the best use of computing resources, a virtualization technology with higher resource utilization is needed to support automatic scaling. Against the background of the rapid development of distributed computing, this thesis studies the technologies related to a Docker-based distributed IPTV data acquisition system, analyzes its requirements, and then designs, implements, and deploys the whole system. The system is divided into four layers: the data acquisition application access layer, the distributed data cache and persistence layer, the container application virtualization layer, and the container orchestration and visualization management layer. Finally, this thesis tests the system's functions to verify that it meets the functional requirements, and tests its performance to verify that it meets the non-functional requirements. |