With the arrival of the 5G era, the volume of data generated by devices connected to the Internet of Things has grown dramatically, while new mobile applications such as face recognition, online gaming, and mobile augmented reality, which are both computation-intensive and delay-sensitive, are becoming increasingly popular. Because the resources of mobile devices are limited and cannot meet the low-latency requirements of such tasks, Mobile Edge Computing (MEC) has emerged as a candidate architecture for solving this problem. In an MEC system, wireless communication and computing resources must be allocated carefully to satisfy task delay and energy-consumption requirements. In addition, as an important part of edge intelligence, edge inference based on neural network partitioning has become an important means of bridging the gap between the high computing-resource demands of artificial intelligence tasks and the available hardware. In the edge computing scenario, this paper studies the joint optimization of computation offloading and resource allocation with respect to the underlying hardware implementation, as well as fine-grained partitioning of deep neural networks. The specific research contents are as follows.

(1) The joint optimization of computation offloading and resource allocation with hardware awareness in a multi-user, multi-MEC (Mobile Edge Computing) system is studied. Previous optimization schemes modeled computation at a coarse granularity, which degrades precision; here, the hardware-level implementation details of the server executing the offloaded task are fully considered. Building on prior work that models computation at the granularity of machine instructions, the input/output bottleneck and the energy consumption of the memory modules are additionally taken into account, and a joint optimization model is established that minimizes system energy consumption while meeting the delay requirements of offloaded tasks. In addition, a hybrid online KM (Kuhn-Munkres) bipartite matching
algorithm based on Deep Deterministic Policy Gradient (DDPG) is used to handle the high-dimensional action space. Simulation results show that memory energy consumption cannot be ignored in the computation process, and that the proposed algorithm effectively learns the optimal strategy and significantly reduces system energy consumption.

(2) To further reduce computation and memory footprint, a new scheme based on convolution kernel partitioning is proposed on top of existing workload-partitioning schemes. The two schemes are compared quantitatively in terms of computation, memory consumption, and communication cost, and the flexibility, robustness, and privacy of the inference process are analyzed qualitatively. Finally, a hardware experimental platform is built and the AlexNet and VGG11 networks are implemented with PyTorch to further verify the performance advantages of the proposed scheme in terms of delay and energy consumption. The results show that, compared with the workload-partitioning scheme, which is suitable only for a small number of devices, the proposed convolution kernel partitioning scheme offers greater computational flexibility and stability, and accelerates DNN inference well in large-scale computing scenarios.

(3) In a large-scale intelligent computing system consisting of multiple user devices, the joint optimization of computation offloading and resource allocation based on convolution kernel partitioning is studied. The problem of service interruption caused by device mobility and limited battery capacity is fully considered: a fault-tolerance mechanism is introduced that sends each convolution kernel to multiple cooperating devices simultaneously and transmits the computation results to a relay node for unified verification. To further accelerate the inference process, a joint optimization model of convolution kernel
offloading and resource allocation is established that minimizes inference delay subject to constraints on system computing and communication resources, device memory, and energy consumption. In this paper, the Deep Deterministic Policy Gradient (DDPG) algorithm is used to solve this optimization problem, whose action space mixes high-dimensional discrete and continuous variables and whose constraints are complex in form. Simulation results show that the scheme based on convolution kernel partitioning effectively reduces both inference delay and memory footprint.
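The core idea of convolution kernel partitioning described above can be sketched in PyTorch (the framework used in contribution (2)): the kernels (output channels) of a convolutional layer are split across several devices, each device computes only its slice of the output feature map, and the slices are concatenated to recover the full result. This is a minimal sketch with arbitrary example shapes, not the thesis's actual AlexNet/VGG11 setup or its fault-tolerant dispatch logic:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Example input and a conv layer with 16 kernels (output channels);
# the shapes here are illustrative, not taken from the thesis.
x = torch.randn(1, 3, 8, 8)
weight = torch.randn(16, 3, 3, 3)

# Reference: the full convolution computed on a single device.
full = F.conv2d(x, weight, padding=1)

# Kernel partition: split the 16 kernels across 4 "devices".
# Each device needs the whole input but only 1/4 of the weights,
# and computes only its own slice of the output channels.
parts = [F.conv2d(x, w_i, padding=1) for w_i in weight.chunk(4, dim=0)]

# Merging the per-device slices reproduces the full output exactly,
# since each output channel depends only on its own kernel.
merged = torch.cat(parts, dim=1)
assert torch.allclose(full, merged, atol=1e-5)
```

Because each output channel is computed independently, this partition introduces no approximation error; the costs traded off in the thesis (broadcasting the input to every device and gathering the slices) appear as communication overhead rather than accuracy loss.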