| With the development of artificial intelligence technologies such as deep learning, a large number of intelligent end devices have been deployed in real-world scenarios. These devices can both sense data and compute, and they are widely used in fields such as traffic monitoring, status classification, and trajectory tracking. However, the power of deep learning comes at the cost of heavy computational requirements. To meet the demand for real-time analysis of large-scale data in resource-constrained scenarios (limited computational capacity on the end device and limited network bandwidth), an end-cloud collaboration framework is usually used to process the collected data. Existing filtering techniques select a portion of the video frames and upload them to the cloud, aiming to reduce computational overhead and bandwidth consumption while improving the accuracy of system inference. Because the data distribution changes dynamically over time in real scenarios, many existing filters that rely on single-modal information lack stability and robustness over the long term. We find that heterogeneous filters based on different modal features perform differently during operation, which leads to different locally optimal filters at different times. This inspires the following idea: although it is difficult for a single filter to meet the robustness requirement, overall optimal performance can be obtained by scheduling locally optimal filters in real time. In this dissertation, we design a reinforcement-learning-based heterogeneous filter scheduling framework (HeteroPush) that adaptively schedules locally optimal filters to perform filtering tasks, aiming to maximize the cost-effectiveness of the communication overhead (e.g., the increase in system accuracy per unit of upload). We apply it to the video analysis scenario and evaluate HeteroPush comprehensively with data from two transportation systems in the real
world. The main work of this dissertation is as follows: (1) Based on existing research, we summarize three types of heterogeneous filters according to the modality of the evidence used for filtering: input-based, output-based, and knowledge-based filters. We improve or redesign the heterogeneous filter structures so that they are suitable for deployment on end devices, taking into account constraints such as limited computing resources and real-time performance. (2) Scheduling heterogeneous filters requires extracting usable features from the information available on end devices. However, most deep learning methods cannot be trained properly because of the large number of noisy samples (samples with the same inputs but conflicting filtering labels). We therefore model distinguishable features based on the performance trends of the lightweight models and filters, and then design a scheduler based on a deep reinforcement learning framework. We also propose a bonus reward mechanism to improve filtering accuracy and training efficiency. (3) The historical performance trends of the filters and lightweight models are used to extract effective features and are therefore a critical part of the framework. However, the performance of a filter can only be assessed accurately after the data are uploaded to the cloud and the output of the heavy model is obtained, and once this is done, deploying the lightweight model no longer serves any purpose. Therefore, accurately evaluating filter performance when some of the real labels are missing is a key challenge. To this end, we design a model performance profiler based on a dynamic scaling window, and use a data modeling approach to explore how the amount of historical data within the window and its upload rate relate to the accuracy of performance evaluation. We also design a set of non-learning window scaling formulas that can cost-efficiently profile and evaluate the performance of the filters and the lightweight inference model in real time. (4) The above
modules are combined to build a heterogeneous filter scheduling framework, HeteroPush. We implement it and simulate its operation in two real monitoring scenarios: traffic flow monitoring and traffic state classification. We analyze the experimental results from multiple aspects and evaluate the accuracy of the video analysis system under several resource constraints. On the traffic state classification dataset, HeteroPush achieves 90% cost-effectiveness while introducing only 2.7% additional computational overhead. The contributions of this dissertation can be summarized as follows: (1) We identify the long-term performance instability of heterogeneous filters and analyze the opportunities for solving it. (2) We propose a heterogeneous filter scheduling framework based on reinforcement learning to improve the long-term stability and robustness of the data analysis system; it includes the improvement and design of the heterogeneous filters, the design of the filter scheduler, and the black-box model performance feedback method. (3) We implement the proposed filtering framework in the video analysis scenario and verify its effectiveness using data from real systems. |
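
The notion of cost-effectiveness used above (the increase in system accuracy per unit of upload) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the ratio below, the function names `cost_effectiveness` and `pick_filter`, and all numbers are assumptions for illustration only. The greedy selection is a simple stand-in and is not the reinforcement learning scheduler of HeteroPush.

```python
from typing import Dict, Tuple


def cost_effectiveness(acc_with_upload: float, acc_without_upload: float,
                       upload_fraction: float) -> float:
    """Accuracy gained per unit of upload (assumed formulation)."""
    if upload_fraction <= 0:
        return 0.0  # nothing uploaded, so no gain can be attributed to uploads
    return (acc_with_upload - acc_without_upload) / upload_fraction


def pick_filter(recent_stats: Dict[str, Tuple[float, float, float]]) -> str:
    """Greedy stand-in for a scheduler: choose the filter whose recent
    window shows the highest cost-effectiveness. This is NOT the
    reinforcement learning scheduler described in the abstract."""
    return max(recent_stats,
               key=lambda name: cost_effectiveness(*recent_stats[name]))


# Illustrative numbers only, not results from the dissertation.
# Each tuple: (accuracy with upload, accuracy without upload, upload fraction)
stats = {
    "input_based": (0.90, 0.80, 0.50),
    "output_based": (0.88, 0.80, 0.20),
    "knowledge_based": (0.85, 0.80, 0.25),
}
best = pick_filter(stats)
```

In this toy setting the filter that uploads the least data for a comparable accuracy gain is preferred, which mirrors the stated goal of maximizing accuracy improvement per unit of communication.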