| In recent years, the rapid growth of network applications, and of video applications in particular, has led to a surge in video traffic, which poses severe challenges for network management. At the same time, the quality of video content is uneven, and some videos convey harmful information; game videos, for example, can pose considerable potential harm to the development of minors. Effectively identifying and managing video traffic has therefore become an urgent problem. Traditional traffic identification methods mostly rely on port number matching to identify applications, or on deep packet inspection to identify network applications and protocols at the traffic level. However, with the spread of dynamic port technology and the encryption of a large share of network traffic, these two types of traditional methods are no longer applicable. As a result, traffic identification based on machine learning has become widely used, which also lays a solid foundation for video traffic identification. Existing research on video traffic identification either focuses on predicting the QoS and QoE of video traffic or constructs video fingerprints to identify video applications and video titles. However, it has not addressed the identification of video-on-demand scenes or the early-stage identification of live game streams. To fill this gap, this thesis carries out the following research work: (1) To build a video traffic dataset and train machine learning models, this thesis designs a collection architecture for video-on-demand, cloud game video, and live video traffic. The platforms covered include 2 on-demand platforms (YouTube and Bilibili), 4 cloud game platforms (START, YOWA, Migu, and Tianyi), and 3 live streaming platforms (Bilibili, Huya, and Douyu). In addition, traffic from online chat, web browsing, file downloading, and other common applications
is also collected as negative samples for video traffic identification research. (2) A feature extraction method based on peak points is proposed. Analysis shows that, for the same video content, the distribution of packet sizes and the distribution of total packet payload per second are almost constant, whereas these distributions differ markedly between different videos. We therefore propose a feature extraction method based on the peak points of packet payload size and byte rate. Using the collected data, the peak point features and flow statistical features are combined into a feature set, a feature selection method based on distribution distance is applied to reduce the feature dimension, and video scene traffic and cloud game video traffic are then identified. (3) A feature extraction method based on windowed peak points is proposed. To better represent the continuity of video content, a sliding window is designed on top of the peak point features above to extract byte rate peak point features. Experiments on the collected video scene and cloud game datasets analyze the impact of different window sizes and offset factors on recognition performance. The results show that this method improves the recognition accuracy of video-on-demand traffic. (4) A feature extraction method based on the ADU sequence is proposed. Although encryption is widely used in network video traffic, different video types still exhibit different feature distribution patterns because of the segmented transmission used by the DASH video delivery mechanism. Existing studies focus only on identifying video streams under the QUIC protocol and do not consider the identification of video scene traffic. For the DASH segmentation mechanism of video on demand, this thesis defines the data between two upstream request packets as an Application Data Unit (ADU). Then, the ADU sequence of each video scene stream is
extracted, and the statistical features of the ADUs, together with the peak point features from the preceding work, are calculated to complete video scene identification. (5) A fast feature extraction method is proposed. At present, most work on live video traffic identification focuses only on improving video QoS and QoE and has not studied the identification of live scene traffic in depth. Identifying video traffic at an early stage is also an indispensable part of network management, and quickly identifying and classifying live traffic in a real environment remains an urgent problem. Therefore, this thesis extracts and analyzes the payload sizes of the first 10 effective packets of each live video stream, aiming to achieve fast, fine-grained identification of live video, on-demand video, and other common applications. |
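
The ADU definition in item (4), the data between two upstream request packets, can be illustrated with a minimal sketch. This is not the thesis implementation. The packet representation, the function names `split_into_adus` and `adu_statistics`, the `min_request_bytes` threshold, and the choice to count only downstream payload bytes are all assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical packet record: (timestamp_seconds, direction, payload_bytes).
# direction is "up" (client to server) or "down" (server to client).
Packet = Tuple[float, str, int]


def split_into_adus(packets: List[Packet], min_request_bytes: int = 1) -> List[int]:
    """Group downstream payload bytes into ADUs delimited by upstream requests.

    Assumption: an upstream packet carrying at least `min_request_bytes`
    of payload counts as a request; smaller upstream packets (for example
    pure ACKs) are ignored. Downstream data seen before the first request
    is discarded.
    """
    adus: List[int] = []
    current = 0
    seen_request = False
    for _ts, direction, payload in packets:
        if direction == "up" and payload >= min_request_bytes:
            # A new request closes the ADU that was being accumulated.
            if seen_request and current > 0:
                adus.append(current)
            current = 0
            seen_request = True
        elif direction == "down" and seen_request:
            current += payload
    # Flush the final ADU, which has no closing request.
    if seen_request and current > 0:
        adus.append(current)
    return adus


def adu_statistics(adus: List[int]) -> Dict[str, float]:
    """Simple per-flow statistics over an ADU size sequence."""
    if not adus:
        return {"count": 0, "mean": 0.0, "max": 0, "min": 0}
    return {"count": len(adus), "mean": mean(adus), "max": max(adus), "min": min(adus)}


if __name__ == "__main__":
    flow = [
        (0.0, "up", 300), (0.1, "down", 1400), (0.2, "down", 1400),
        (1.0, "up", 300), (1.1, "down", 700),
        (2.0, "up", 0),   (2.1, "down", 100),   # zero-payload upstream packet is ignored
    ]
    sizes = split_into_adus(flow)
    print(sizes, adu_statistics(sizes))
```

The resulting ADU size sequence could then be combined with other flow features as classifier input. Which statistics the thesis actually computes over the sequence is not stated in this abstract.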