Research On Multi Factor Stock Model Based On Multidimensional Bayesian Network Classifier

Posted on: 2022-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: W X Hu
GTID: 2518306317993659
Subject: Management Science and Engineering
Abstract/Summary:
The effectiveness of multi-factor investment systems is an important guarantee for the stability of China's financial market. However, the development of such systems faces several problems. First, factors are overused, which causes them to fail. Second, not all variables are linearly related, and nonlinear correlations weaken the predictive ability of simple linear multi-factor models. Third, existing risk models take past risk as expected risk, which reduces stock selection accuracy to some extent, and the time cost of applying the models is too high. Fourth, existing multi-factor investment systems predict return and risk separately rather than jointly, so they are not multidimensional.

With the rapid development and application of machine learning in finance, academia and industry have studied it in depth and constructed a variety of effective nonlinear stock selection models built around SVMs, neural networks, and ensemble learning algorithms. In addition, homogeneous factors no longer meet the requirements of effective model construction, and building new factors from Internet big data with wide coverage has attracted widespread attention. In 2011, Derwent was the first to construct sentiment-based strategies from Twitter data, expanding its factor library and issuing a stock fund. In 2014, Baidu Jinrong and Guangfa Fund constructed the Baifa 100 Index stock fund, the first in China based on Internet big data. An effective multi-factor stock selection model with new factors, nonlinearity, and low time cost therefore remains the goal. However, existing nonlinear multi-factor stock selection models do not address the multidimensional problem. Therefore, drawing on increasingly influential stock-related Internet big data, this thesis crawls the Snowball website to construct a public sentiment factor, builds an MDMF model based on multidimensional Bayesian network classifiers (MBCs), which represent relationships among variables well, and proposes the IBNL algorithm to improve stock selection ability while keeping the time cost under control. The main contents and innovations are as follows:

(1) Class-bridge decomposability can effectively reduce the inference time complexity of MBCs. To address the lack of effective learning methods for class-bridge decomposable MBCs (CB-MBCs), this thesis proposes the IBNL algorithm. First, it learns the class subgraph using general BN methods. Second, for each class variable in turn, it finds a set of highly correlated feature variables based on the information gain ratio, and retains only the directed edges between the class variable and the feature variables with the highest information gain ratio to construct the bridge subgraph. Third, it learns and updates the feature subgraph using general BN methods and BIC, deciding whether to add a directed edge between feature variables based on their class parent nodes. In short, the algorithm takes into account the influence of the predictive variables and their relationships on the relationships among other variables, and describes complex nonlinear variable relationships well; it uses higher BIC scores to ensure the accuracy of the model structure, which further improves the accuracy of multidimensional classification; and it learns a class-bridge decomposable model that is easy to perform inference on, which keeps the time cost low and lays a foundation for further research on CB-MBCs.

(2) To address factor failure, a factor affecting stock prices is constructed from stock information on social platforms, whose influence has grown year by year as the Internet has spread. Such information increases the probability of excess returns and has become a mainstream research direction. This thesis crawls data from Snowball, a stable website with a large number of users, and constructs the public sentiment factor with the AdaBoost algorithm, which is resistant to overfitting and has high precision. First, we crawled the stock-related data published by some "big V" users from December 1, 2009 to December 31, 2020. Second, we organized the data by time and stock name, cleaned the text, segmented it into words, and constructed word vectors. Third, combining these with yields, we trained AdaBoost and took the predicted probability of a positive return as the public sentiment factor. Fourth, the experimental results verify the effectiveness of the public sentiment factor.

(3) To ensure the validity of the nonlinear multi-factor stock selection model, and in line with the essence of stock selection (weighing return against risk), this thesis first combines the public sentiment factor with 40 factors (from growth, momentum, quality, and other modules) as feature variables. Second, it calculates the next-period values of the return and risk indicators (yield, volatility, Sharpe ratio, and maximum drawdown) as class variables. Third, it unifies the time series data to build the model data set, and builds the MDMF model with the IBNL algorithm to predict future return and future risk. Compared with other investment methods, the model considers four indicator dimensions and their relationships, and clearly depicts the nonlinear relationships among return, risk, and each factor. This is more consistent with the investment practice of weighing return against risk, and offers a new approach to nonlinear multi-factor stock selection. It not only makes full use of the advantages of CB-MBCs in multidimensional classification and ease of inference, but also avoids the "black box" nature of applied machine learning methods. At the same time, it expands the application of MBCs in the financial field.
Keywords/Search Tags:Multi factor investment system, MBCs, MDMF, IBNL, Public sentiment factor
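
As a rough illustration of the four class variables named in part (3) of the abstract, the following sketch computes yield, volatility, Sharpe ratio, and maximum drawdown (the largest peak-to-trough decline) from a price series for one period. The function name `period_indicators`, the 252-day annualization, and the zero risk-free rate are assumptions made for this example; the abstract does not give the thesis's exact definitions, nor how the values are discretized into classes.

```python
import math


def period_indicators(prices, periods_per_year=252, risk_free_rate=0.0):
    """Compute return and risk indicators for one period of positive prices.

    Hypothetical helper: the annualization constant and risk-free rate
    are assumptions, not values taken from the thesis.
    """
    if len(prices) < 3:
        raise ValueError("need at least three prices")

    # Simple returns between consecutive observations.
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]

    # Yield: total return over the whole period.
    total_yield = prices[-1] / prices[0] - 1.0

    # Volatility: annualized sample standard deviation of the returns.
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    volatility = math.sqrt(variance) * math.sqrt(periods_per_year)

    # Sharpe ratio: annualized excess return per unit of volatility.
    excess = mean * periods_per_year - risk_free_rate
    sharpe = excess / volatility if volatility > 0 else float("nan")

    # Maximum drawdown: largest relative drop from a running peak.
    peak = prices[0]
    max_drawdown = 0.0
    for price in prices:
        peak = max(peak, price)
        max_drawdown = max(max_drawdown, (peak - price) / peak)

    return {
        "yield": total_yield,
        "volatility": volatility,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }


if __name__ == "__main__":
    print(period_indicators([100.0, 110.0, 99.0, 121.0]))
```

In a setup like the one the abstract describes, these values would be computed for the period following each feature observation and then binned into discrete classes before being used as class variables; the binning scheme is left open here because the abstract does not specify it.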