Empirical Research On Stock Selection Based On Naive Bayes, Linear Discriminant, and Quadratic Discriminant Classification Algorithms

Posted on: 2019-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: H S Lin
GTID: 2370330545453107
Subject: Applied Statistics
Abstract/Summary:
This paper presents an empirical study of stock selection based on the Naive Bayes (NB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA) classification algorithms, and gives investors suggestions for building a portfolio, which makes it of practical value. It introduces the theory, advantages and disadvantages, algorithm steps, and technical details of the three algorithms, and applies them to select stocks for a portfolio.

The Naive Bayes classification algorithm assumes that the factors are independent of each other. In most circumstances the factor data does not meet this requirement, which leads to imprecision. We therefore apply PCA to remove the dependency between factors, but the effect is not very good. We then turn to the LDA and QDA classification algorithms, which do not require the factors to be independent of each other. The difference between them is that LDA assumes the factors in different classes share the same covariance, namely that of the whole sample, while QDA assumes the factors in different classes have different covariances. We use Ledoit-Wolf covariance shrinkage to improve model performance.

This paper uses the constituent stocks of the CSI300 as the stock pool and selects 39 factors from 13 categories: valuation, growth, financial quality, leverage, market value, momentum, volatility, stock price, beta, turnover rate, sentiment index, shareholder, and technical index. It labels the monthly return rates of the training set, makes the training period as long as possible, backtests a long strategy, a short strategy, and a long-short strategy, and uses the CSI300 as the benchmark portfolio. Accuracy and AUC are used to compare the classification performance of the NB, LDA, and QDA models; annualized excess return, maximum drawdown of excess return, and information ratio are used to compare backtest performance.

In general, LDA's classification and backtest performance are better than those of the other classification models. We believe this is because LDA accounts for the dependency between factors and assumes that factors in different classes have the same dependency structure, and these assumptions fit actual conditions best. QDA makes the most detailed assumptions, but it has to estimate the most parameters, so its results are not as good. When the number of factors is small and the data volume is relatively large, LDA has an overwhelming advantage. In addition, we compare NB, LDA, and QDA with Logistic Regression and conclude that the classification and backtest performance of LDA and Logistic Regression are similar, with LDA slightly better, while NB and QDA perform no better than Logistic Regression. This paper believes that applying shrinkage when estimating the covariance matrix improves model performance.
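
The three backtest measures named in the abstract (annualized excess return, maximum drawdown of excess return, and information ratio) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's exact formulas, so the definitions below are common conventions assumed for illustration: monthly returns, geometric compounding, and a sample standard deviation as tracking error. All function names and the sample numbers are hypothetical.

```python
import math
import statistics


def excess_returns(portfolio, benchmark):
    """Per-period excess return: portfolio return minus benchmark return."""
    return [p - b for p, b in zip(portfolio, benchmark)]


def annualized_excess_return(excess, periods_per_year=12):
    """Geometrically compound the excess returns, then scale to one year.

    Assumption: monthly data (12 periods per year) unless stated otherwise.
    """
    growth = math.prod(1.0 + r for r in excess)
    return growth ** (periods_per_year / len(excess)) - 1.0


def max_drawdown(excess):
    """Largest peak-to-trough fall of the compounded excess-return curve."""
    nav, peak, worst = 1.0, 1.0, 0.0
    for r in excess:
        nav *= 1.0 + r
        peak = max(peak, nav)
        worst = max(worst, (peak - nav) / peak)
    return worst


def information_ratio(excess, periods_per_year=12):
    """Annualized mean excess return divided by annualized tracking error.

    Needs at least two periods, because the sample standard deviation
    is undefined for a single observation.
    """
    mean = statistics.fmean(excess)
    tracking_error = statistics.stdev(excess)
    return mean / tracking_error * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical monthly returns, not data from the thesis.
    strategy = [0.03, -0.01, 0.02, 0.04, -0.02, 0.01]
    benchmark = [0.02, -0.02, 0.01, 0.02, -0.01, 0.00]
    ex = excess_returns(strategy, benchmark)
    print("annualized excess return:", annualized_excess_return(ex))
    print("max drawdown of excess return:", max_drawdown(ex))
    print("information ratio:", information_ratio(ex))
```

In this sketch the benchmark series plays the role the CSI300 plays in the abstract. Each strategy (long, short, long-short) would supply its own return series to the same three functions.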
Keywords/Search Tags: build stock portfolio, Naive Bayes Classification Algorithm, Linear Discriminant Classification Algorithm, Quadratic Discriminant Classification Algorithm