
Research On Mean-Variance Portfolio Selection Problems Based On Exploratory Entropy Regularization Framework

Posted on: 2024-03-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Wei
Full Text: PDF
GTID: 2568306923472974
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, with the continued maturation of computer technology, intelligent tools have been widely adopted in industry, the military, finance, and other fields. The application of reinforcement learning, one of the most active areas of artificial intelligence, to control problems has gradually become a research hotspot for scholars at home and abroad. For example, Wang et al. [1] first established an exploratory entropy regularization framework by combining reinforcement learning with stochastic control theory, and used it to solve stochastic control problems. The portfolio selection problem is a common practical application in stochastic control; its main purpose is to select the optimal investment strategy, within the investor's risk tolerance, that achieves the investor's expected return. However, for portfolio selection problems based on the mean-variance criterion, the variance of terminal wealth is not linear in the expectation, so most well-known reinforcement learning methods cannot be applied directly. To overcome this difficulty, Wang and Zhou [2] used the framework of [1] to study mean-variance portfolio selection problems with a fixed investment horizon in a reinforcement learning setting. However, due to unexpected events, investors may withdraw from investment activities before the terminal time; that is, an investor's exit time is random. It is therefore of great practical significance to further study the portfolio selection problem with random exit time by combining stochastic control theory and reinforcement learning. On the other hand, the mean-variance criterion introduces a term that is nonlinear in the expectation into the objective function, so the optimal strategy may be time-inconsistent: a strategy that is optimal at the initial moment may no longer be optimal later. Most investors, however, prefer a strategy that is optimal at all times, so it is necessary to obtain a time-consistent strategy for mean-variance portfolio selection problems in a reinforcement learning setting.

Therefore, this thesis first studies the random-time-horizon mean-variance portfolio selection problem and its pre-commitment strategy under the exploratory entropy regularization framework; second, it studies the time-consistent equilibrium strategy for the mean-variance problem under the exploratory framework. The main research contents are as follows:

(1) This thesis studies a mean-variance portfolio selection problem with random exit time, and its pre-commitment strategy, under the exploratory entropy regularization framework, in which investors may exit randomly and aim to maximize the expected return and minimize the variance of wealth at exit. First, by constructing an auxiliary problem corresponding to the exploratory mean-variance portfolio selection problem, the multi-objective optimization problem is transformed into a single-objective one. Second, using the probability distribution of the exit time, the random-time-horizon problem is transformed into one with a deterministic time horizon. The optimal strategy under the pre-commitment assumption is then obtained via the dynamic programming principle. Finally, the solvability equivalence between the exploratory random-time-horizon problem and the classical problem is analyzed, and the effectiveness of the established exploratory regularization framework for the random-time-horizon mean-variance problem is illustrated, laying a theoretical foundation for applying reinforcement learning to this class of random-time-horizon mean-variance problems.

(2) This thesis studies a mean-variance portfolio selection problem and its time-consistent equilibrium strategy under the exploratory entropy regularization framework. Since portfolio selection based on the mean-variance criterion exhibits time inconsistency, the classical Bellman optimality principle no longer applies. First, extended HJB equations are given for the cases of constant and state-dependent risk preference, respectively. Second, the corresponding time-consistent equilibrium strategies for the exploratory problems are derived by means of the extended HJB equations and the Lagrange multiplier method. Finally, the solvability equivalence between the exploratory mean-variance portfolio selection problem and the classical problem is analyzed, and the effectiveness of the established exploratory entropy regularization framework for the time-inconsistent mean-variance problem is demonstrated, laying a theoretical foundation for applying reinforcement learning to this class of time-inconsistent mean-variance problems.
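To make the type of problem described above concrete, the following is a schematic sketch of an entropy-regularized mean-variance objective in the spirit of Wang and Zhou [2]; the notation is generic and illustrative, not taken from the thesis itself. Here $X_t^{\pi}$ is the exploratory wealth process under a distributional (randomized) control $\pi$, $w$ is the Lagrange multiplier introduced by the auxiliary problem to handle the mean constraint $\mathbb{E}[X_T^{\pi}] = z$, and $\lambda > 0$ is a temperature parameter weighting the entropy of $\pi_t$:

```latex
% Schematic entropy-regularized mean-variance objective (illustrative notation):
% the variance criterion is reduced to a quadratic cost via the Lagrange
% multiplier w, and the term with \lambda penalizes negative entropy of \pi_t,
% thereby rewarding exploration.
\begin{equation}
  \min_{\pi}\;
  \mathbb{E}\!\left[\,(X_T^{\pi} - w)^2
    + \lambda \int_0^T \!\!\int_{\mathbb{R}} \pi_t(u)\,\ln \pi_t(u)\,\mathrm{d}u\,\mathrm{d}t
  \right] - (w - z)^2
\end{equation}
```

In the fixed-horizon case this single-objective formulation admits dynamic programming; the thesis's first contribution extends this pattern to a random exit time, and its second replaces the pre-commitment solution with a time-consistent equilibrium via extended HJB equations.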
Keywords/Search Tags: Entropy regularization, mean-variance, portfolio selection problem, random time horizon, time-inconsistent