Research And Implementation Of Revenue Optimization Algorithm For Demand Side Platform Based On Reinforcement Learning

Posted on: 2022-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: J X Li
GTID: 2518306524980769
Subject: Software engineering
Abstract/Summary:
In recent years, the rapid growth of information technology and the Internet has driven considerable development in the high-tech industry and brought new vitality to traditional industries. At the same time, the bidding transaction model in the online advertising market has been continuously developed and improved. Real-time bidding (RTB), a form of programmatic transaction, stands out among these models and has brought unprecedented benefits to the advertising industry. Given the advantages of RTB and the technology behind it, the field has attracted growing attention from researchers. This thesis studies two of the most important and active research topics for the demand-side platform (DSP): click-through rate (CTR) prediction and bid strategy optimization. For both, it proposes reinforcement learning (RL) algorithms aimed at increasing DSP revenue. The research results are as follows.

First, for bid strategy optimization, this thesis proposes an intelligent bidding strategy based on a model-free RL algorithm. Unlike traditional bidding strategies, which treat the bidding decision as static, this strategy models bidding over the ad delivery period as a sequential, dynamic, interactive process. The intelligent bidding agent can therefore make more reasonable bidding decisions for ad impressions, dynamically allocating the budget to each impression based on current and long-term future revenue.

Second, to improve the efficiency and performance of RL-based bidding, this thesis proposes a bidding strategy based on policy-based RL. Unlike the intelligent bidding strategy above, this strategy does not directly generate a bidding decision for each impression. Instead, it divides an ad delivery period into several time slots, and the bidding agent sets each impression's bid price from the impression's estimated value and the bidding factor of the time slot in which it arrives. The bidding problem is thereby reduced to generating the optimal bidding factor for each time slot, which can adapt dynamically to the RTB environment. Experimental results on real-world data sets show the effectiveness of this strategy.

Third, to improve the accuracy of CTR prediction, this thesis applies, for the first time, a model-free RL algorithm with a hybrid action space (containing both discrete and continuous actions). The algorithm acts as a weighted-average ensemble scheme and generates a suitable CTR estimate for each impression. The proposed model is not limited to CTR prediction; it can also be applied to other tasks with a hybrid action space.
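The time-slot bidding idea described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the bid depends on the impression's estimated value and the bidding factor of its arrival time slot; the multiplicative form used here, the equal-width slots, the `max_bid` cap, and all function and parameter names are assumptions made for illustration, not the thesis's actual formulation. In the thesis the per-slot factors would come from the RL agent; here they are a fixed hypothetical list.

```python
from typing import Sequence


def slot_index(arrival_time: float, period: float, num_slots: int) -> int:
    """Map an arrival time in [0, period) to an equal-width time-slot index."""
    if not 0 <= arrival_time < period:
        raise ValueError("arrival time is outside the delivery period")
    # Clamp guards against floating-point rounding at the upper edge.
    return min(int(arrival_time / period * num_slots), num_slots - 1)


def bid_price(
    estimated_value: float,
    arrival_time: float,
    slot_factors: Sequence[float],
    period: float = 24.0,    # assumed delivery period length (hours)
    max_bid: float = 300.0,  # assumed upper bound on any single bid
) -> float:
    """Bid = estimated value * bidding factor of the arrival slot (assumed form)."""
    slot = slot_index(arrival_time, period, len(slot_factors))
    return min(estimated_value * slot_factors[slot], max_bid)


# Hypothetical factors for four 6-hour slots; an RL agent would supply these.
factors = [0.5, 1.0, 2.0, 1.5]
early_bid = bid_price(100.0, 0.0, factors)    # slot 0
midday_bid = bid_price(100.0, 13.0, factors)  # slot 2
capped_bid = bid_price(200.0, 13.0, factors)  # exceeds max_bid, so capped
```

Under these assumptions the agent's action space shrinks from one decision per impression to one factor per time slot, which is the simplification the abstract attributes to the policy-based strategy.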
Keywords/Search Tags:Real-Time Bidding, Demand Side Platform, Reinforcement Learning, Click-Through Rate Prediction, Bidding Strategy