
Research On RFID Indoor Positioning Algorithm Based On Deep Reinforcement Learning

Posted on: 2021-02-05  Degree: Master  Type: Thesis
Country: China  Candidate: L Li  Full Text: PDF
GTID: 2518306110460144  Subject: Information processing and communication network system
Abstract/Summary:
Radio Frequency Identification (RFID) is contactless and identifies targets automatically, so it has broad application prospects in indoor positioning. In complex positioning environments and high-density positioning spaces, however, existing RFID indoor positioning technology still has many problems in positioning accuracy, positioning stability, positioning time, and environmental adaptability. In recent years, machine learning has been introduced into RFID indoor positioning, with neural networks used to fit the positions of RFID targets. For large-scale positioning, though, the fitting capability of the shallow neural networks used in traditional machine learning is limited. This thesis therefore introduces deep reinforcement learning into RFID indoor positioning; it provides self-correcting and feedback mechanisms and is suitable for large-scale, multi-target dynamic RFID positioning. The main research contents and innovations of this thesis are as follows:

1. An RFID indoor positioning algorithm based on asynchronous advantage actor-critic (A3C) is proposed. Building on an in-depth analysis of neural-network-based RFID indoor positioning, the algorithm no longer feeds the received signal strength indication (RSSI) or other positioning information directly into a neural network; instead, it introduces the reward feedback mechanism of reinforcement learning. The RFID indoor positioning process is treated as a Markov decision process, and action evaluation in reinforcement learning is combined with a deep neural network to define the positioning action, environment, and reward. The positioning model is trained with multi-threaded parallel networks, so that it can adapt dynamically to the environment and achieve dynamic positioning.

2. An RFID indoor positioning algorithm based on semi-supervised actor-critic co-training (SACC) is proposed. The algorithm combines actor-critic with random actions and uses semi-supervised co-training to select the best unlabeled RSSI data. It uses the labeled and unlabeled RSSI data together for positioning, and updates the actor and critic using a natural gradient computed with a Kronecker-factored approximation. This method not only adapts dynamically to the environment but also significantly reduces the number of tags and the cost of positioning. Experimental results show that the SACC-based RFID indoor positioning algorithm has a faster convergence rate, lower positioning cost, better positioning ability, and better robustness.

3. An RFID positioning algorithm based on Proximal Policy Optimization (PPO) is proposed. The algorithm combines actor-critic with random actions, further maximizes the return of the action, and selects the best coordinate value. It introduces clipped probability ratios to limit the action to a certain range, alternates between sampling data from the policy and updating the policy for multiple epochs on minibatches with stochastic gradient ascent, evaluates the action with the critic network, and finally obtains the PPO positioning model. Experiments show that this method effectively reduces positioning error and improves positioning efficiency. It also converges faster, and it greatly reduces computational complexity, especially when handling a large number of positioning targets.
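
The clipped probability ratio mentioned in item 3 can be illustrated with a minimal sketch of the standard PPO clipped surrogate objective. This is not the thesis's implementation: the function name `clipped_surrogate`, the clip range `epsilon=0.2`, and the sample values in the usage note are assumptions chosen only for illustration.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective over a minibatch (to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates, e.g. derived from a critic network.
    epsilon: clip range (hypothetical default, not taken from the thesis).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Taking the smaller term removes the incentive for large policy steps.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

For example, with action probabilities 0.6 and 0.1 under the new policy, 0.4 and 0.2 under the old one, and advantages 1.0 and -1.0, the ratios 1.5 and 0.5 are clipped to 1.2 and 0.8, and the objective is approximately 0.2. In a full training loop this quantity would be maximized by stochastic gradient ascent over several epochs of minibatches, as the abstract describes.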
Keywords/Search Tags:Radio Frequency Identification, Indoor Positioning, Deep Reinforcement Learning, Actor Critic, Co-training