
Research On Mobile Web Deep Learning Inference Based On Edge Computing

Posted on: 2022-09-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y K Huang
Full Text: PDF
GTID: 1488306350988659
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of 5G network technology and the widespread adoption of smart terminals, mobile web applications, which combine network capabilities with cross-platform features, have enabled many mobile Internet services, including mobile office, mobile life, and mobile social applications. It is foreseeable that, as mobile web browsers gain ever more efficient access to the native computing capabilities of terminals, mobile web services will become the mainstream form of the future mobile Internet. However, running computationally intensive deep learning tasks on the mobile web remains difficult because of the inefficient JavaScript computing environment and the instant-loading service delivery mechanism. Even with collaborative computing services built on high-bandwidth, low-latency edge computing architectures, achieving efficient, energy-saving deep learning inference on the mobile web still faces many challenges: (1) existing distributed deep learning inference techniques are difficult to deploy and run directly on the mobile web, and their estimation models for inference latency and energy consumption do not account for inefficient computing environments and dynamic contexts, so they cannot provide distributed inference solutions that balance multiple optimization objectives. (2) Because of the weak computing environment of the mobile web, the most intensive deep learning inference is typically offloaded to the edge or remote cloud, and there is a lack of lightweight collaborative inference services that can perform task inference independently on the mobile web. (3) The dynamic and complex operating environment of the mobile web makes it difficult for lightweight collaborative branches fixed to particular devices and platforms to provide context-aware, adaptive collaborative inference, and balancing the inference efficiency and accuracy of lightweight branch models is a further challenge for collaborative inference. (4) Although the "Cloud-Edge-Device" architecture enables dynamic collaborative inference between the mobile web and servers, user mobility and aggregation can produce highly concurrent collaborative request instances, and deploying large amounts of idle server resources in the edge cloud is impractical. Providing dynamic task scheduling and resource dispatching for highly concurrent mobile web collaboration requests in an edge computing environment is therefore an important foundation for rationally optimizing resource usage and enhancing the service experience.

To address these challenges, this thesis investigates key techniques of deep learning inference for the mobile web based on edge computing, focusing on distributed inference with adaptive partition, lightweight binary branching, context-aware collaborative inference, and dynamic scheduling of collaborative requests for the mobile web. The research contents and contributions are as follows: (1) We propose distributed deep learning inference with adaptive partition between the mobile web and the edge cloud. We design a distributed inference framework with adaptive partition for the mobile web, propose self-learning estimation methods for inference latency and energy consumption, extend support for ubiquitous terminals and web browsers, and build multi-objective optimization models that yield the best partition strategy. Experimental results on typical datasets show its effectiveness and efficiency compared with representative distributed deep learning inference schemes. (2) We design a collaborative inference framework with a lightweight binary branch and provide a joint training method for the full-precision backbone network and the binary branch, allowing the lightweight branch to exit collaborative inference early once the accuracy requirement is satisfied. We also implement an efficient binary branch inference library for mobile web platforms, which supports online collaborative inference between the mobile web and the edge cloud more effectively than existing web deep learning inference libraries. Validation on various deep learning models and classical datasets shows that the designed lightweight collaborative inference scheme effectively reduces inference latency and mobile energy consumption and improves system throughput. (3) Building on lightweight binary branching for collaborative mobile web inference services, we further design a framework that fuses and links the mobile web, the edge cloud, and the remote cloud to provide a deep learning inference service. It includes a context-aware lightweight model pruning algorithm that considers the inference latency, network bandwidth, and terminal computing capability of online collaborative mobile web requests. We also introduce a cache model update mechanism and a real-time intelligent model matcher in the online phase to handle cache updates and real-time matching of lightweight collaborative branch models, improving the efficiency of serving lightweight branch model requests from massive numbers of mobile web users. (4) To further improve the inference efficiency, mobile energy consumption, and system throughput of collaborative deep learning, we propose a dynamic scheduling scheme based on reinforcement learning in the edge cloud environment that improves resource utilization across edge centers and balances the computational load. A two-stage online reinforcement learning method optimizes different scheduling objectives at different stages, and a reward prediction model effectively reduces the training and convergence difficulty of the model. Simulation results and practical application validation show the effectiveness of the proposed dynamic scheduling algorithm.
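The adaptive partition idea in contribution (1) can be illustrated with a minimal sketch: given per-layer latency estimates for the browser and the edge, plus the size of each layer's intermediate output, pick the split point that minimizes total latency. All numbers and names below are illustrative assumptions, not the thesis's actual estimation models (which also cover energy consumption and multi-objective trade-offs).

```python
def best_partition(web_ms, edge_ms, transfer_kb, bandwidth_kb_s):
    """Pick split index k: layers [0, k) run in the mobile web browser,
    layers [k, n) run on the edge cloud. transfer_kb[k] is the size of
    the tensor shipped to the edge when splitting at k."""
    n = len(web_ms)

    def latency(k):
        upload_ms = transfer_kb[k] / bandwidth_kb_s * 1000
        return sum(web_ms[:k]) + upload_ms + sum(edge_ms[k:])

    # k = 0 means full offloading; k = n means fully local inference.
    return min(range(n + 1), key=latency)
```

With a fast edge and a slow browser, a narrow intermediate tensor deep in the network becomes the preferred split; as bandwidth grows, the optimum shifts toward full offloading.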
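Contribution (2)'s early-exit behavior can be sketched as a confidence check on the lightweight branch's output: if the branch's softmax confidence clears a threshold, the result is returned locally; otherwise the input is offloaded to the full backbone. The threshold and the toy models here are assumptions for illustration; the thesis's binary branch design and joint training are not reproduced.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def infer_with_early_exit(x, branch, backbone, threshold=0.9):
    """Run the lightweight branch locally; offload to the backbone
    (e.g. on the edge cloud) only when the branch is unsure."""
    probs = softmax(branch(x))
    conf = max(probs)
    if conf >= threshold:
        return probs.index(conf), "branch"   # early exit on the mobile web
    return backbone(x), "backbone"           # collaborative fallback
```

Easy inputs thus never leave the device, which is what cuts latency, mobile energy, and server load at once.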
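The context-aware pruning of contribution (3) ultimately has to shrink a branch model to fit a given terminal and bandwidth. A common building block for this is magnitude-based filter pruning, sketched below under the assumption of a simple L1-norm criterion, with the pruning ratio standing in for the output of the (not reproduced) context model.

```python
def prune_filters(filters, ratio):
    """Keep the (1 - ratio) fraction of filters with the largest L1 norm.
    `filters` is a list of flat weight lists for one convolutional layer."""
    norms = [sum(abs(w) for w in f) for f in filters]
    keep = max(1, round(len(filters) * (1 - ratio)))
    ranked = sorted(range(len(filters)), key=lambda i: -norms[i])
    kept_indices = sorted(ranked[:keep])      # preserve original filter order
    return [filters[i] for i in kept_indices]
```

A tighter latency budget or weaker terminal maps to a larger `ratio`, producing a smaller cached branch for the matcher to serve.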
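Contribution (4)'s two-stage scheduling can be approximated, for intuition only, by a deterministic heuristic: stage one filters the edge nodes that can still meet a request's latency budget, stage two load-balances among them. This greedy stand-in deliberately replaces the thesis's reinforcement learning and reward prediction; the capacities and budgets are invented units.

```python
def dispatch(budgets, capacities):
    """Assign each request (with a latency budget) to an edge node.
    Stage 1: keep nodes whose queue-latency proxy (load+1)/capacity
    stays within the budget. Stage 2: pick the least relatively loaded."""
    loads = [0] * len(capacities)
    placement = []
    for budget in budgets:
        feasible = [i for i, c in enumerate(capacities)
                    if (loads[i] + 1) / c <= budget]
        pool = feasible or list(range(len(capacities)))  # degrade gracefully
        target = min(pool, key=lambda i: loads[i] / capacities[i])
        loads[target] += 1
        placement.append(target)
    return placement
```

A learned policy improves on this by anticipating bursts from user mobility and aggregation rather than reacting to the current queue state alone.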
Keywords/Search Tags:Edge Computing, Cross-Platform, Mobile Web, Deep Learning, Collaborative Inference