
Portfolio Model Construction Based On Adaptive Multi-armed Bandit Algorithm Improved By Genetic Algorithm

Posted on: 2022-12-26  Degree: Master  Type: Thesis
Country: China  Candidate: Y L Liu
GTID: 2518306731994209  Subject: Finance
Abstract/Summary:
With the rise of artificial intelligence, quantitative investment research grounded in financial mathematics and computer technology has entered a stage of quantitative analysis, and modern investment theory is beginning to move away from reliance on personal experience and descriptive research. Most of these studies use machine learning, reinforcement learning, optimization algorithms, and similar methods to build models for quantitative stock selection, and such models have achieved good returns in both developed capital markets and emerging markets. Since the start of the 21st century, China's capital market has developed rapidly, but on the whole it is still not fully mature. At the same time, China faces the challenges brought by the financial crisis and by frequent market fluctuations. Research on portfolio construction through quantitative strategies therefore has growing theoretical and practical value for resisting market risk and stabilizing market order, and this paper uses reinforcement learning and machine learning algorithms to construct a portfolio selection model.

This paper formulates portfolio selection as a Markov decision process, that is, as a sequence of portfolio decisions made under uncertainty. It introduces the contextual Linear Upper Confidence Bound (LinUCB) algorithm from the bandit family of algorithms and adopts Expected Utility Theory to define the degree of an investor's preference for a portfolio. The portfolio selection criterion is the upper confidence bound of the investor's preference for the portfolio. Because the reinforcement learning algorithm learns online, the model can learn by itself over multiple iterations during the experimental period and finally select the portfolio that is most satisfactory to the investor. To optimize the parameters of the LinUCB algorithm, we introduce a genetic algorithm, which keeps the model effective in its operating environment and ultimately maximizes the cumulative return.

The research proceeds in three parts. First, we build an adaptive multi-armed bandit model based on Expected Utility Theory to select portfolios and update their weights. Second, we optimize the parameters of the adaptive multi-armed bandit with a genetic algorithm. Third, we present and compare the results of the selected optimal portfolios. The results show that the portfolio selection model constructed in this paper, an adaptive multi-armed bandit algorithm improved by a genetic algorithm, earns excellent returns in the experimental period. For investors with different risk preferences, its backtest results are better than those of the CSI 300 index and of four different types of funds, which indicates the effectiveness of the strategy model. Further analysis shows that the model learns the investor's preferences during the experimental period. The recommended portfolios perform better when the model runs close to the adjustment period.
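The abstract does not give the model's details, so the following is only a minimal sketch of the general idea: a disjoint LinUCB bandit whose exploration parameter is tuned by a small genetic algorithm. Everything specific here is an assumption made for illustration and is not taken from the thesis: the exponential (CARA) utility used as the preference signal, the synthetic data, the names `LinUCBArm`, `cara_utility`, `run_linucb`, `tune_alpha` and `make_toy_data`, and the choice to tune only `alpha`.

```python
import math
import random


class LinUCBArm:
    """One arm of disjoint LinUCB: stores A^{-1} and b of a ridge regression."""

    def __init__(self, dim):
        # A starts as the identity matrix, so A^{-1} is the identity too.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in self.a_inv]

    def ucb(self, x, alpha):
        theta = self._a_inv_times(self.b)  # theta = A^{-1} b
        mean = sum(t * xi for t, xi in zip(theta, x))
        quad = sum(xi * vi for xi, vi in zip(x, self._a_inv_times(x)))
        return mean + alpha * math.sqrt(max(quad, 0.0))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^{-1} after A <- A + x x^T.
        v = self._a_inv_times(x)
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, v))
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= v[i] * v[j] / denom
            self.b[i] += reward * x[i]


def cara_utility(ret, risk_aversion=1.0):
    """Assumed preference signal: exponential utility of a period return."""
    return (1.0 - math.exp(-risk_aversion * ret)) / risk_aversion


def run_linucb(alpha, contexts, returns, risk_aversion=1.0):
    """Pick one candidate portfolio (arm) per step; return the cumulative return."""
    arms = [LinUCBArm(len(contexts[0])) for _ in range(len(returns[0]))]
    total = 0.0
    for x, r in zip(contexts, returns):
        scores = [arm.ucb(x, alpha) for arm in arms]
        k = max(range(len(arms)), key=scores.__getitem__)
        arms[k].update(x, cara_utility(r[k], risk_aversion))  # learn on utility
        total += r[k]  # evaluate on raw return
    return total


def tune_alpha(contexts, returns, pop_size=8, generations=5, seed=0):
    """Toy genetic algorithm over alpha: elitism, averaging crossover, Gaussian mutation."""
    rng = random.Random(seed)

    def fitness(a):
        return run_linucb(a, contexts, returns)

    population = [rng.uniform(0.0, 2.0) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = 0.5 * (p + q) + rng.gauss(0.0, 0.1)
            children.append(min(max(child, 0.0), 2.0))  # keep alpha in [0, 2]
        population = parents + children
    return max(population, key=fitness)


def make_toy_data(n_steps=200, n_arms=3, dim=2, seed=1):
    """Synthetic contexts and per-arm returns; stands in for real market data."""
    rng = random.Random(seed)
    true_theta = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_arms)]
    contexts, returns = [], []
    for _ in range(n_steps):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        contexts.append(x)
        returns.append([sum(t * xi for t, xi in zip(th, x)) + rng.gauss(0, 0.1)
                        for th in true_theta])
    return contexts, returns


if __name__ == "__main__":
    ctx, ret = make_toy_data()
    best = tune_alpha(ctx, ret)
    print("tuned alpha:", best, "cumulative return:", run_linucb(best, ctx, ret))
```

In this sketch the bandit learns from the utility of each realized return while the genetic algorithm scores candidates by cumulative raw return, which mirrors the abstract's split between investor preference and final performance. Tuning on the same data used for evaluation would overfit in practice, so a real study would score `alpha` on a separate window.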
Keywords/Search Tags: portfolio selection, sequential decision making, multi-armed bandit algorithm, genetic algorithm, quantitative investment