
Research on a Recommendation Algorithm Based on the Contextual Restless Multi-Armed Bandit Model

Posted on: 2019-05-10    Degree: Master    Type: Thesis
Country: China    Candidate: M K Zhang    Full Text: PDF
GTID: 2428330548956875    Subject: Engineering
Abstract/Summary:
The main function of a recommendation system is to select products or information from a candidate pool according to the user's interests. Recommendation algorithms have attracted much attention because of their wide application and important theoretical value. Traditional recommendation algorithms discover user interests mainly by analyzing historical user behavior, and they perform well when the candidate pool and the user population are relatively static. However, the rapid development of Internet technology raises new challenges for them. First, new items and new users appear continuously, and traditional methods cannot recommend accurately for new users who lack interaction records. Second, Internet data is highly dynamic, and algorithms built on historical behavior analysis adapt poorly to data that changes over time and across space.

The contextual multi-armed bandit (CMAB) model has been widely used in online recommendation in recent years. Compared with traditional algorithms based on historical behavior, a CMAB-based recommendation algorithm continuously updates its strategy from incoming feedback, handles user data dynamically, and effectively alleviates the cold-start problem, so it is better suited to scenarios in which new items are constantly added to the candidate pool. Although CMAB-based recommendation has strong theoretical support and good practical results, the feedback on recommendations can only be partially observed; this weakens the algorithm's ability to mine long-tail data and lowers the coverage of the recommendations. To address this partial observability, this thesis borrows the idea of modeling item state from the restless multi-armed bandit (RMAB) model, extends CMAB into a contextual restless multi-armed bandit (CRMAB) model, and applies it to recommendation. The work consists of three parts.

First, we propose a linear contextual reward method combined with collaborative information. Existing CMAB-based algorithms compute the expected reward without considering interactions between users, so we incorporate the influence relationships among users into the reward estimate.

Second, we propose a Thompson sampling algorithm based on the contextual restless multi-armed bandit model. Because the LinTS algorithm offers high accuracy at low computational cost, we take LinTS as the starting point and improve it with the RMAB idea of modeling item state. The improved algorithm can estimate an item's state, and the corresponding expected reward, even when no feedback for that item is observed; when computing each item's expected reward, it relies not only on contextual information but also on the item's state. We call the resulting method the contextual restless multi-armed bandit based Thompson sampling algorithm (CRTS); a minimal sketch of such a sampling loop is given below.
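To make the second contribution concrete, the following is a minimal sketch of a LinTS-style Thompson sampling loop extended with a per-item state term, in the spirit of CRTS. The class name, the multiplicative combination of contextual score and item state, the exponential-decay state dynamics, and all hyperparameters are illustrative assumptions for this sketch, not the exact formulation used in the thesis.

# Minimal sketch: LinTS-style Thompson sampling with a per-item "restless" state
# term. Bayesian linear-regression updates follow standard LinTS; the state model
# and hyperparameters are illustrative assumptions only.
import numpy as np

class CRTSSketch:
    def __init__(self, dim, n_items, v=0.5, decay=0.9):
        self.v = v                     # posterior scale (exploration strength)
        self.decay = decay             # assumed state-transition decay rate
        self.B = np.eye(dim)           # precision matrix of the reward weights
        self.f = np.zeros(dim)         # accumulated reward-weighted contexts
        self.state = np.ones(n_items)  # per-item state, evolves even without feedback

    def select(self, contexts):
        """contexts: (n_items, dim) array; returns the index of the item to recommend."""
        mu = np.linalg.solve(self.B, self.f)            # posterior mean of the weights
        cov = self.v ** 2 * np.linalg.inv(self.B)       # posterior covariance
        theta = np.random.multivariate_normal(mu, cov)  # Thompson sample of the weights
        # Expected reward combines the sampled linear contextual reward with the
        # current item state (assumption: multiplicative combination).
        scores = (contexts @ theta) * self.state
        return int(np.argmax(scores))

    def update(self, item, context, reward):
        """Incorporate observed feedback; states of all items keep evolving."""
        self.B += np.outer(context, context)
        self.f += reward * context
        # All item states drift toward a resting value even without feedback; the
        # played item's state is refreshed by the observed reward (illustrative
        # restless dynamics, not the thesis's state model).
        self.state = self.decay * self.state + (1.0 - self.decay)
        self.state[item] = 0.5 * self.state[item] + 0.5 * reward

In practice the reward would come from observed user feedback (for example, a click), and the state dynamics would follow the CRMAB state model rather than the fixed decay assumed here.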
Third, CRTS is evaluated on real data. We compare CRTS with LinTS and RTS on the MovieLens and Last.FM datasets; the experimental results show that CRTS achieves higher coverage while keeping the cumulative regret low.
Keywords/Search Tags:contextual multi-armed bandit, Markov decision process, coverage, recommendation algorithm