
Optimized Factor Selection Model Based On The Ensemble Learning

Posted on: 2020-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2428330590971084
Subject: Quantitative Economics
Abstract/Summary:
The multi-factor stock selection model is a classic method in quantitative finance and serves as a basis for subsequent research on stock analysis. From the traditional regression method to the traditional ranking method, many researchers have worked on multi-factor selection. With the development of computer science, mathematics, and statistics, approaches to multi-factor stock selection have become far more varied, and those based on machine learning or data mining have drawn increasing attention. In this setting, machine learning is essentially used as a classification algorithm: applied to the stock market, it performs a binary classification that predicts whether a stock will rise or fall. Within machine learning, ensemble learning stands out, especially the XGBoost and Adaboost algorithms. The former was proposed by Chen Tianqi, a PhD from the University of Washington; the latter is one of the most famous machine learning algorithms.

In this paper, a traditional scoring method is first established to create a stock pool of about sixty stocks, and this pool is then backtested. An XGBoost model and an Adaboost model are then established separately to create their own stock pools. Finally, the traditional method and the ensemble learning methods are compared on their backtest results.

At the outset, nearly forty candidate factors that may influence returns were placed in the factor pool. These factors cover technical analysis, fundamentals, and other aspects. HS300 stocks and their factor data from January 2010 to November 2018 were used for the empirical research. First, single-factor validity tests were conducted on all candidate factors; then, through correlation analysis of the valid factors, redundant factors were dropped; finally, 9 effective factors were selected. Based on these 9 factors, the first factor model was built to create the stock pool. During the backtest period, its excess return relative to the HS300 benchmark was merely 0.3%, which is not satisfactory. To mine more of the information hidden in the data, an XGBoost model and an Adaboost model were built; during the backtest period, their excess returns relative to the HS300 benchmark were 7.3% and 5.5%, respectively. Both methods outperformed the traditional method. It can therefore be concluded that the ensemble learning methods mined the data more deeply than the traditional method.
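The factor screening and scoring steps described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how factor validity was tested, so a rank correlation (rank IC) between factor values and next-period returns is assumed here. The factor names, the toy data, the thresholds `ic_min` and `corr_max`, and all function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Rank from 1 (smallest) upward; ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def rank_ic(factor, next_returns):
    """Rank correlation between factor values and next-period returns."""
    return pearson(ranks(factor), ranks(next_returns))


def select_factors(factors, next_returns, ic_min=0.3, corr_max=0.8):
    """Assumed validity test: keep factors with |rank IC| >= ic_min, then
    drop any factor too correlated with a stronger factor already kept."""
    ics = {name: rank_ic(vals, next_returns) for name, vals in factors.items()}
    valid = sorted((n for n in ics if abs(ics[n]) >= ic_min),
                   key=lambda n: -abs(ics[n]))
    kept = []
    for name in valid:
        if all(abs(pearson(factors[name], factors[k])) <= corr_max for k in kept):
            kept.append(name)
    return kept, ics


def score_stocks(factors, kept, ics, top_n):
    """Equal-weight scoring: sum each stock's factor ranks (sign taken from
    the IC) and return the indices of the top_n highest-scoring stocks."""
    n = len(next(iter(factors.values())))
    scores = [0.0] * n
    for name in kept:
        sign = 1 if ics[name] >= 0 else -1
        for i, r in enumerate(ranks(factors[name])):
            scores[i] += sign * r
    return sorted(range(n), key=lambda i: -scores[i])[:top_n]


# Toy cross-section of six stocks (hypothetical data, not from the thesis).
factors = {
    "momentum": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "momentum_copy": [0.2, 0.4, 0.6, 0.8, 1.0, 1.2],  # redundant with momentum
    "value": [0.2, 0.4, 0.1, 0.6, 0.3, 0.5],
    "noise": [0.3, 0.4, 0.6, 0.1, 0.2, 0.5],
}
next_returns = [-0.02, 0.01, 0.00, 0.03, 0.02, 0.05]

kept, ics = select_factors(factors, next_returns)
pool = score_stocks(factors, kept, ics, top_n=2)
print(kept, pool)
```

On this toy data the weakly related factor fails the validity test and the duplicated factor is dropped as redundant, leaving two factors to score the stocks. The thesis applies the same kind of pipeline to nearly forty candidate factors and a pool of about sixty stocks.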
Keywords/Search Tags:ensemble learning, XGBoost algorithm, Adaboost algorithm, multi-factor selection model, valid factor test