Research On Deep Reinforcement Learning Recommendation Algorithm For Interactive Recommendation

Posted on: 2022-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: C Rao
GTID: 2518306572997429
Subject: Computer technology
Abstract/Summary:
In recent years, with the widespread use of mobile applications such as Douyin, interactive recommendation systems have attracted wide attention. Traditional recommendation systems usually focus on predicting the user's interest at a specific point in time, typically the next moment. In interactive recommendation, however, the system must attend not only to the user's current interest but also to how that interest changes in response to the current recommendation, so that it can plan for the long term, increase the user's usage time, and gain benefits. Traditional recommendation methods are therefore not suited to interactive recommendation scenarios. Reinforcement learning relies on continuous, autonomous learning through interaction between an agent and its environment, which matches the form of interactive recommendation.

However, applying reinforcement learning to recommender systems poses several challenges. Here the agent corresponds to the recommender system and the environment corresponds to the real user. First, reinforcement learning requires a massive amount of interaction for training, so training against real users is impractical. Second, the agent adjusts its policy based on reward feedback from the environment, but the user's reward function is unknown. Third, the value networks used by existing reinforcement learning recommendation algorithms overestimate values, which degrades recommendation quality.

To solve these problems, this thesis proposes a deep reinforcement learning recommendation model, UEDR (Offline User Environment Based Deep Reinforcement Learning). UEDR uses an OUE (Offline User Environment) to model user behavior: a recurrent neural network learns the latent distribution of the user's behavior from the user's history records so that the behavior can be simulated. The OUE also uses a neural network as a function approximator to learn the user's reward function. Drawing on the idea of generative adversarial networks, the OUE is trained to make the reward of clicked items as large as possible and the reward of unclicked items as small as possible. In addition, a deep reinforcement learning recommendation algorithm based on the actor-critic architecture is designed, improving on Twin Delayed Deep Deterministic Policy Gradient (TD3), to train the model. A TCN (Twin Critic Network) is used to address the value overestimation caused by the value networks of existing reinforcement learning algorithms and thereby improve recommendation performance.

Using the real-world MovieLens 100K data set, the OUE was established as the reinforcement learning environment to train the UEDR model and carry out comparative experiments. The experimental results show that UEDR achieves better recommendation performance than the comparison algorithms. Further ablation experiments show that both the OUE and the TCN effectively improve recommendation performance.
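The two training signals described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. The first function shows the clipped double-Q target from standard TD3, which takes the smaller of two critic estimates to curb overestimation; how the thesis's TCN modifies TD3 is not stated here. The second function is a hypothetical objective that widens the gap between rewards of clicked and unclicked items. The names `clipped_double_q_target` and `reward_gap_loss`, and all parameters, are assumptions made for illustration.

```python
from typing import Sequence


def clipped_double_q_target(
    reward: float,
    q1_next: float,
    q2_next: float,
    done: bool,
    gamma: float = 0.99,
) -> float:
    """TD target as in standard TD3: bootstrap from the smaller critic estimate.

    q1_next and q2_next are the two target critics' values for the next
    state and next action. Taking the minimum counters overestimation.
    """
    if done:
        # No bootstrapping past the end of an episode.
        return reward
    return reward + gamma * min(q1_next, q2_next)


def reward_gap_loss(
    clicked_rewards: Sequence[float],
    unclicked_rewards: Sequence[float],
) -> float:
    """Hypothetical loss: mean unclicked reward minus mean clicked reward.

    Minimizing it pushes rewards of clicked items up and rewards of
    unclicked items down. The thesis's actual objective may differ.
    """
    mean_clicked = sum(clicked_rewards) / len(clicked_rewards)
    mean_unclicked = sum(unclicked_rewards) / len(unclicked_rewards)
    return mean_unclicked - mean_clicked


if __name__ == "__main__":
    # The two critics disagree (5.0 vs 3.0); the target uses the lower one.
    print(clipped_double_q_target(1.0, 5.0, 3.0, done=False, gamma=0.5))
    print(reward_gap_loss([1.0, 3.0], [0.0, 1.0]))
```

In a full system the scalar inputs would come from neural networks evaluated on batches. Plain floats are used here only to keep the arithmetic visible.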
Keywords/Search Tags: Deep Learning, Reinforcement Learning, Interactive Recommender System, Offline User Environment