Research On Multi-factor Quantitative Stock Selection Model Based On Ensemble Algorithm

Posted on: 2022-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: S H Zhou
GTID: 2518306521480024
Subject: Business Intelligence
Abstract/Summary:
With the advancement of information technology and the development of mathematical models and algorithm theory, quantitative investment has become one of the mainstream investment methods in the capital market. Quantitative investment can obtain excess returns because it avoids the emotional biases of individual investors and helps them find reliable trading patterns in massive data. Although China's capital market has grown into one of the major global capital markets, its history of quantitative investment is relatively short compared with that of the United States. More and more scholars and practitioners are now applying overseas quantitative experience to China's financial sector, the number and scale of quantitative funds keep expanding, and a wide variety of quantitative investment strategies has emerged. In this light, quantitative investment is clearly booming.

This paper reviews the domestic and international literature on effective factors, quantitative stock selection, and ensemble algorithms. It finds that ensemble algorithms such as XGBoost and LightGBM have produced strong academic results in environmental protection, energy utilization, credit approval and other fields, and are widely used in competitions. However, few studies apply multiple ensemble algorithms to quantitative stock selection in the Chinese A-share market, or compare the different algorithms qualitatively and quantitatively. This paper therefore studies a multi-factor quantitative stock selection model based on ensemble algorithms.

At the end of 2020, the constituent stocks of the CSI 300 Index accounted for 57.34% of the total market value of China's A-shares. The CSI 300 Index can therefore serve as the benchmark for the performance of the investment portfolios constructed in this paper. This study takes the constituent stocks of the CSI 300 Index as the research object and collects stock data from the first quarter of 2011 to the fourth quarter of 2020, a period that includes a complete bull market, a bear market, and a range-bound market.

Factor selection starts from the factor pool framework that earlier scholars derived from systematic research on long-term effective factors in the United States, combined with research on effective factors in the domestic A-share market. On the basis of the available data, this study constructs a pool of 98 factors covering momentum, value-growth, investment, profitability, intangible-asset and trading-friction factors.

For algorithm selection, this paper chooses one or two algorithms from each of the three ensemble strategies, Bagging, Boosting and Stacking, for a total of four ensemble algorithms. For Bagging, the classic random forest (RF) is selected. For Boosting, XGBoost and LightGBM, which improve on GBDT, are chosen. For Stacking, RF and XGBoost, which differ from each other and each perform well individually, form the first layer, and LightGBM forms the second layer, giving the RXL-Stacking algorithm.

The model works on a quarterly cycle: the first six quarters form the training set and the seventh quarter the test set, and each ensemble algorithm is trained and used for prediction on a rolling basis over the historical data. In the candidate stock list that an algorithm generates for the current quarter, stocks are sorted in descending order of their predicted probability of beating the benchmark index. The top k stocks (k ∈ {10, 50, 100, 150, 200}) are selected to construct an equally weighted portfolio, which is then backtested. With four ensemble algorithms and five portfolio sizes, this yields 20 investment portfolios. Finally, the portfolios are evaluated in terms of profitability and risk.

The main findings are:
(1) For all four ensemble algorithms, only a few parameters improve performance when tuned; the rest can be left at their default values. Because parameter optimization has limited effect, improving data quality through factor selection and data preprocessing is a better way to improve model effectiveness.
(2) LightGBM has clear advantages in memory consumption and running time, whereas RXL-Stacking takes longer to tune and runs more slowly than the other three ensemble algorithms.
(3) All four ensemble algorithms show stock-picking ability in the long run. RXL-Stacking performs best, with an average AUC of 0.644 over the 34 rolling training and forecasting periods.
(4) Return on net assets, total market value, average turnover rate over the past three months, price-earnings ratio (PER), price-to-book ratio (PBR) and other factors contribute most to the model's classification. These high-importance factors cover six categories of factors, indicating that the factor pool framework in this paper is reasonable.
(5) For a given algorithm, portfolio performance is better when fewer positions are opened. For a given number of positions, the portfolio built with RXL-Stacking outperforms those built with the other algorithms. The best portfolio is the one built with RXL-Stacking and k = 10. Over the backtest interval, its annualized return, Sharpe ratio and maximum drawdown were 19.86%, 1.46 and 28.55% respectively, better than the CSI 300 Index's 9.22%, 0.23 and 32.69%.

The quantitative stock selection model constructed in this paper therefore has strong classification ability and good stability, and can help investors improve the accuracy of stock selection and obtain higher excess returns.
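
The rolling scheme and the top-k selection described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the function names, the ticker symbols and the probabilities are hypothetical, and no model is trained here. It only shows that a 6-quarter training window followed by a 1-quarter test window over 2011Q1 to 2020Q4 gives 34 rolling periods, which agrees with the 34-period figure in the abstract, and how an equally weighted portfolio could be formed from ranked probabilities.

```python
from typing import Dict, List, Tuple


def quarter_labels(start_year: int, end_year: int) -> List[str]:
    """Return labels such as '2011Q1' for every quarter in the inclusive year range."""
    return [f"{y}Q{q}" for y in range(start_year, end_year + 1) for q in range(1, 5)]


def rolling_windows(quarters: List[str], train_len: int = 6) -> List[Tuple[List[str], str]]:
    """Pair each run of `train_len` training quarters with the following test quarter."""
    return [
        (quarters[i:i + train_len], quarters[i + train_len])
        for i in range(len(quarters) - train_len)
    ]


def top_k_equal_weight(probabilities: Dict[str, float], k: int) -> Dict[str, float]:
    """Rank stocks by predicted probability (descending) and weight the top k equally."""
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    chosen = ranked[:k]
    return {ticker: 1.0 / len(chosen) for ticker in chosen}


quarters = quarter_labels(2011, 2020)   # 40 quarters of data
windows = rolling_windows(quarters)     # 6 training quarters + 1 test quarter each
print(len(quarters), len(windows))
print(windows[0])
print(windows[-1])

# Made-up probabilities of beating the benchmark, for illustration only.
toy_probs = {"AAA": 0.71, "BBB": 0.44, "CCC": 0.63, "DDD": 0.58}
print(top_k_equal_weight(toy_probs, k=2))
```

In a full pipeline, each window would train a classifier on the six training quarters and produce the probabilities for the test quarter; the toy dictionary above stands in for that output.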
Keywords: Quantitative stock selection, Multi-factor model, Ensemble algorithm, Investment portfolio