The deep integration of intelligent unmanned systems with the Internet of Things, intelligent manufacturing, and intelligent transportation has become a new growth point for the economy and society. How to fully unlock the efficiency of intelligent unmanned systems in sensing data has become an important issue. With the help of computation offloading in the mobile edge computing paradigm, intelligent unmanned systems can efficiently complete computing tasks over large-scale sensing data. However, the openness and instability of the wireless network in which an intelligent unmanned system operates make it easier for attackers to steal the system's private information. By combining channel monitoring with an inverse reinforcement learning algorithm, an attacker can exploit a vulnerability in deep reinforcement learning-based offloading approaches to infer the offloading policy of the intelligent unmanned system. Once the offloading policy is obtained, the attacker can easily infer the location of the system or identify it. Therefore, protecting the offloading policy of intelligent unmanned systems has become an important part of applying the mobile edge computing paradigm.

To solve the above problems, this thesis focuses on hiding the offloading policy of the intelligent unmanned system. It covers three approaches: a privacy-aware computation offloading approach based on deep Q-learning, a personalized privacy-aware computation offloading approach based on deep Q-learning, and a privacy-aware computation offloading approach based on multi-agent reinforcement learning. The specific research results are as follows:

(1) A privacy-aware computation offloading approach is proposed. Aiming at the problem of offloading policy leakage in deep reinforcement learning-based computation offloading approaches for intelligent unmanned systems, a privacy protection approach based on global differential privacy is
designed. First, the problem of offloading policy leakage is analyzed, and the vulnerability of the offloading preference is identified. The theft of the offloading preference is formulated as an inference attack, and the assumptions and methods that attackers rely on to launch such attacks are clarified. Then, building on the deep Q-learning algorithm, Gaussian noise generated under global differential privacy is added to the action selection and policy update stages to prevent attackers from stealing offloading preferences. At the same time, to improve the efficiency of experience exploration during online learning, the prioritized experience replay technique is adopted: the offloading policy is updated by sampling experiences with larger temporal-difference error from the replay buffer, which reduces the impact of online exploration on the endurance of intelligent unmanned systems. The convergence and privacy of the approach are proved through theoretical analysis. Finally, taking computation offloading in the UAV-assisted Internet of Things as the scenario, a concrete offloading approach is designed, and a simulation platform is built to evaluate it. The experimental results show that the approach can make highly cost-efficient offloading decisions while protecting the offloading preference.

(2) A personalized privacy-aware computation offloading approach is proposed. Aiming at the differentiated privacy protection requirements of the offloading preference of each subtask in the structured tasks of the intelligent unmanned system, a personalized privacy protection approach for the offloading policy based on local differential privacy is designed. First, the dependency relationships among subtasks are defined, and a quality-of-service model is designed according to these dependencies. Second, the deep Q-learning algorithm is adopted as the base computation offloading algorithm. Based on the local differential privacy mechanism, Gaussian
noise is added to the offloading preference of each subtask in the action selection and policy update stages of the offloading policy learning process. Notably, because subtasks require different levels of privacy protection, a balance factor is set to regulate the noise magnitude so that the offloading decision of each subtask receives the protection it requires. Moreover, the effectiveness of an inference attack based on inverse reinforcement learning is positively correlated with the number of experiences the attacker obtains. Therefore, to further weaken such inference attacks, the prioritized experience replay technique is integrated into the deep Q-learning algorithm to improve the utilization efficiency of experience and reduce excessive experience exploration. The convergence and privacy of the approach are proved by theoretical analysis. Finally, taking computation offloading in the Industrial Internet of Things as the scenario, the effectiveness, privacy protection ability, and cost-effectiveness of the PA-MADRL approach are evaluated on a real-world dataset. The experimental results show that the PA-MADRL approach can make more cost-effective offloading decisions for each ground unmanned vehicle in a distributed manner while protecting the offloading preference of each ground unmanned vehicle.

(3) A privacy-aware distributed computation offloading approach is proposed. Aiming at the privacy protection requirement for the offloading preference that arises from the distributed deployment of computation offloading approaches in the intelligent unmanned system, a privacy-aware computation offloading approach based on multi-agent reinforcement learning is designed. First, under the constraint of protecting the offloading preference, a joint optimization problem over the offloading action and transmit power is formulated to minimize the system cost, which includes local cost and edge cost. Second, considering the non-convexity and
centralization of the joint optimization problem, a multi-agent reinforcement learning algorithm based on global differential privacy is proposed to solve it, so that the intelligent unmanned system can reach a Nash equilibrium without leaking the offloading preference. The core idea of this approach is to add Gaussian noise to the action selection and policy update processes of the original multi-agent deep reinforcement learning algorithm. However, the kernel function used in the above two approaches is no longer applicable because the dimensions of the system state space and action space increase. Therefore, a kernel function based on Manhattan distance is designed for the distributed deployment scenario of the computation offloading approach in the intelligent unmanned system. The convergence and privacy of the approach are proved using game theory. Finally, taking computation offloading in the vehicular ad hoc network as the scenario, the effectiveness, privacy protection ability, and cost-efficiency of the approach are verified on a real-world dataset. The experimental results show that the approach can make highly cost-efficient offloading decisions for each vehicle in a distributed manner while protecting the offloading preference of each vehicle.
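
The building blocks named in this abstract (Gaussian noise in action selection, prioritized experience replay, and a Manhattan-distance kernel) can be illustrated with a minimal Python sketch. It is not the thesis implementation. All function names and parameters are hypothetical. The noise scale uses the classical Gaussian-mechanism calibration, perturbing Q-values directly is an assumed injection point, and the exponential form of the kernel is an assumption, since the abstract states only that the kernel is based on Manhattan distance.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (valid for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def noisy_greedy_action(q_values, sigma, rng):
    """Pick the action whose Gaussian-perturbed Q-value is largest (assumed injection point)."""
    noisy = [q + rng.gauss(0.0, sigma) for q in q_values]
    return max(range(len(noisy)), key=noisy.__getitem__)


def prioritized_sample(td_errors, batch_size, rng, alpha=0.6, eps=1e-6):
    """Sample experience indices with probability proportional to |TD error|^alpha."""
    weights = [(abs(e) + eps) ** alpha for e in td_errors]
    return rng.choices(range(len(td_errors)), weights=weights, k=batch_size)


def manhattan_kernel(x, y, length_scale=1.0):
    """Exponential kernel on the L1 (Manhattan) distance; the functional form is an assumption."""
    l1 = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-l1 / length_scale)


# Illustrative usage with made-up values.
rng = random.Random(0)
sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.5, delta=1e-5)
action = noisy_greedy_action([0.2, 1.5, 0.9], sigma, rng)
batch = prioritized_sample([0.1, 2.0, 0.05, 1.2], batch_size=2, rng=rng)
similarity = manhattan_kernel([0.0, 1.0], [0.5, 0.5])
```

A larger `sigma` hides the true preference ordering of the Q-values more strongly, at the price of more frequent suboptimal offloading actions. This is the privacy-cost trade-off that the balance factor in approach (2) is described as regulating per subtask.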