Research On Deep Reinforcement Learning Recommendation Algorithm For Interactive Recommendation

Posted on:2022-02-14

Degree:Master

Type:Thesis

Country:China

Candidate:C Rao

Full Text:PDF

GTID:2518306572997429

Subject:Computer technology

Abstract/Summary:

In recent years, with the widespread use of mobile applications such as Douyin, interactive recommendation systems have attracted wide attention. Traditional recommender systems usually focus on predicting the user's interest at a specific point in time, typically the next moment. In interactive recommendation, however, the system must attend not only to the user's current interest but also to how the current recommendation changes that interest, so that it can plan for the long term, increase the user's usage time, and gain benefits. Traditional recommendation methods are therefore not suited to interactive recommendation scenarios.

Reinforcement learning relies on an agent learning autonomously through continuous interaction with an environment, which matches the form of interactive recommendation. Applying reinforcement learning to recommender systems, where the agent corresponds to the recommender system and the environment corresponds to the real user, nevertheless poses several challenges. First, reinforcement learning requires a massive number of interactions for training, so training against real users is impractical. Second, the agent adjusts its policy based on reward feedback from the environment, but the user's reward function is unknown. Third, the value networks used by existing reinforcement learning recommendation algorithms overestimate values, which degrades recommendation quality.

To solve these problems, this thesis proposes a deep reinforcement learning recommendation model, UEDR (Offline User Environment Based Deep Reinforcement Learning). UEDR uses an OUE (Offline User Environment) to model user behavior: a recurrent neural network learns the latent distribution of the user's behavior from the user's history records, so that the OUE can simulate that user's behavior. The OUE also uses a neural network as a function approximator to learn the user's reward function. Borrowing ideas from generative adversarial networks, the OUE trains itself by making the reward of clicked items as large as possible and the reward of unclicked items as small as possible. In addition, a deep reinforcement learning recommendation algorithm based on the actor-critic architecture is designed, improving on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, to train the model. A TCN (Twin Critic Network) is used to counter the value overestimation caused by the value networks of existing reinforcement learning algorithms and thereby improve recommendation performance.

Using the real-world MovieLens 100K data set, an OUE was built as the reinforcement learning environment to train the UEDR model and run comparative experiments. The experimental results show that the UEDR algorithm achieves better recommendation performance than the baseline algorithms it was compared against. Further ablation experiments show that both the OUE and the TCN effectively improve recommendation performance.
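The abstract does not spell out how the twin critics counter overestimation. A minimal sketch follows, assuming the TCN takes its target the way standard TD3 does, from the smaller of two critic estimates. The function name, arguments, and numbers are illustrative assumptions, not taken from the thesis.

```python
def clipped_double_q_target(rewards, q1_next, q2_next, dones, gamma=0.99):
    """Compute TD targets from the smaller of two critic estimates.

    Assumption: this mirrors the clipped double-Q target of standard TD3;
    the thesis's improved variant may differ in its details.
    """
    targets = []
    for r, q1, q2, done in zip(rewards, q1_next, q2_next, dones):
        # Taking the minimum of the two critics damps overestimation.
        min_q = min(q1, q2)
        # No bootstrapping from the next state once the episode has ended.
        targets.append(r + gamma * (0.0 if done else min_q))
    return targets


# Hypothetical batch of two transitions; the second one is terminal.
targets = clipped_double_q_target(
    rewards=[1.0, 0.0],
    q1_next=[2.0, 5.0],
    q2_next=[3.0, 4.0],
    dones=[False, True],
    gamma=0.5,
)
```

In the first transition the two critics disagree (2.0 versus 3.0), and the target bootstraps from the lower estimate. This is how a twin-critic design keeps a single optimistic critic from inflating the value estimate.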

Keywords/Search Tags:

Deep Learning, Reinforcement Learning, Interactive Recommender System, Offline User Environment

Related items

1	Offline Reinforcement Learning Algorithms And Their Applications On Industrial Control
2	DNN Inference Business Scheduling System Based On Deep Reinforcement Learning
3	Research On Motion Planning In Dynamic Environment Based On Deep Reinforcement Learning
4	Supervised Reinforcement Learning: Methods And Applications
5	Research On Deep Reinforcement Learning Method For Environment With Non-stationary Dynamics
6	Research On User Generated Sequence-oriented Deep Learning Technology
7	Deep Learning Based Video Inpainting And Reinforcement Learning Methods In Unstable Environment
8	Learning Robotic Grasp Using Deep Reinforcement Learning
9	Agent Environment Perception And Control Decision Based On Deep Reinforcement Learning
10	Research On Active SLAM Algorithm Based On Deep Reinforcement Learning In Complex Environment