Wireless body area networks (WBANs) have attracted great attention from both industry and academia as a promising technology for continuously monitoring the physiological signals of the human body. Because the sensors in WBANs are typically battery-driven and inconvenient to recharge, an energy-efficient resource allocation scheme is essential to prolong network lifetime while still meeting the strict Quality of Service (QoS) requirements inherent to WBANs. As a possible alternative solution to the energy efficiency problem, energy harvesting (EH) technology, which harvests energy from ambient sources, can potentially reduce the dependence on the battery supply. Research on WBANs is growing, and energy harvesting remains a major topic within it: although many energy harvesting schemes have replaced traditional energy-saving strategies, many of these schemes are still immature and flawed. In previous work, energy efficiency has rarely been considered, and even the studies that address energy-harvesting wireless body area networks (EH-WBANs) cover only limited aspects such as total rate, transmission power, and trade-offs between different objectives. This gap motivates the thesis, which investigates energy-efficient resource allocation schemes in EH-WBANs with the aim of maximizing both the energy efficiency and the end-to-end transmission reliability of the WBANs. The main contributions of this thesis are as follows:

(1) We consider the resource allocation problem of EH-WBANs with the objective of maximizing the average energy efficiency of the body sensors. The problem jointly considers the transmission mode, relay selection, allocated time slots, transmission power, and energy state in order to make an optimal allocation decision. We formulate the energy efficiency problem as a discrete-time, finite-state Markov decision process (DFMDP) and, to solve it, propose an improved Q-learning algorithm that reduces the state-action space of the original Q-learning algorithm. Numerical analysis shows that, compared with the classical Q-learning algorithm, the proposed scheme achieves higher energy efficiency and faster convergence by eliminating the irrelevant exploration space in the Q-table.

(2) We aim to maximize the end-to-end reliability of WBANs from the perspective of resource scheduling. In particular, we formulate the resource scheduling problem as a Markov decision process (MDP) by carefully designing the state space, action space, and reward function. Moreover, because the problem is non-convex, a deep reinforcement learning (DRL) algorithm is proposed to solve the maximization problem and to demonstrate how transmission reliability can be guaranteed. Compared with the conventional DRL algorithm, the proposed algorithm dynamically predicts the transmission mode, relay selection, time slot allocation, and transmit power of each body sensor and then jointly schedules the resources under specific constraints. Finally, we verify the effectiveness and performance merits of the proposed resource scheduling strategy through numerical simulations.
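As an illustration of how a Q-table can be shrunk by excluding irrelevant state-action pairs, the following is a minimal sketch under assumptions that are not taken from the thesis: a hypothetical toy model in which the state is a battery level, the action is a transmit power level, an action is feasible only if the battery can pay for it, and the reward is a toy bits-per-energy ratio. The names `feasible_actions`, `step`, and `train`, as well as all constants, are illustrative only. It is not the algorithm proposed in the thesis.

```python
import math
import random

# Hypothetical toy model (an assumption, not the thesis's system model):
# the state is a battery level 0..MAX_BATTERY, the action is a power level 0..MAX_POWER.
MAX_BATTERY = 4
MAX_POWER = 3
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate


def feasible_actions(battery):
    """Power levels the sensor can afford: level p costs p energy units."""
    return [p for p in range(MAX_POWER + 1) if p <= battery]


def step(battery, power, rng):
    """Assumed dynamics: spend `power` units, then harvest 0 or 1 unit at random."""
    # Toy energy-efficiency reward: log2(1 + p) "bits" per unit of energy spent.
    reward = 0.0 if power == 0 else math.log2(1 + power) / power
    harvested = rng.choice([0, 1])
    next_battery = min(MAX_BATTERY, battery - power + harvested)
    return next_battery, reward


def train(episodes=200, horizon=50, seed=0):
    rng = random.Random(seed)
    # Reduced Q-table: only feasible (state, action) pairs get an entry.
    q = {(b, p): 0.0 for b in range(MAX_BATTERY + 1) for p in feasible_actions(b)}
    for _ in range(episodes):
        battery = rng.randint(0, MAX_BATTERY)
        for _ in range(horizon):
            actions = feasible_actions(battery)
            if rng.random() < EPSILON:
                power = rng.choice(actions)  # explore among feasible actions only
            else:
                power = max(actions, key=lambda p: q[(battery, p)])  # exploit
            nxt, reward = step(battery, power, rng)
            best_next = max(q[(nxt, p)] for p in feasible_actions(nxt))
            # Standard Q-learning update, restricted to the reduced table.
            q[(battery, power)] += ALPHA * (reward + GAMMA * best_next - q[(battery, power)])
            battery = nxt
    return q
```

With these toy sizes the reduced table holds 14 entries instead of the 20 entries of a full 5 x 4 table, and exploration never visits an infeasible pair. How the thesis itself identifies irrelevant state-action pairs is not stated in this abstract.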