Stock Price Prediction Research Based On Feature Selection And Improved Stacking Algorithm

Posted on: 2019-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Chen
Full Text: PDF
GTID: 2428330548491796
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, machine learning has been applied in many fields, and a growing number of researchers have applied machine learning algorithms to stock price prediction. The Stacking algorithm has performed well in price prediction tasks in Kaggle competitions. By combining different kinds of regressors, it greatly enhances the generalization ability of the model, and an ensemble that learns from the outputs of multiple learners can often obtain predictions superior to those of a single model. However, traditional stacking fuses the first-level test-set outputs using only the arithmetic mean, which limits the effectiveness of the algorithm. Feature extraction from the sample data also suffers when it relies on a single screening method. How to select features effectively and how to optimize the input to the second-level regressor of stacking are therefore the problems to solve in order to further improve prediction.

To address the unsatisfactory results of single-method feature selection, this paper constructs an integrated feature scorer that jointly evaluates the Pearson correlation coefficient, the rank correlation coefficient, and the XGBoost reverse validation weight factor. It extracts the highly correlated features and deletes the secondary ones. The number of features is chosen at the peak of the Hughes phenomenon, balancing noise avoidance against the loss of useful features.

Because it relies on a simple arithmetic mean, the traditional stacking algorithm ignores two things that matter in stock price prediction: temporal consistency and the differences in learning performance among the base learners. This paper improves the stacking algorithm in two ways. First, it evaluates the accuracy of the models trained on different samples in k-fold cross-validation and applies precision weighting to the prediction output of the first-level test set. Second, it establishes a time-weighting model based on the time distance between the test set and the samples of each cross-validation fold. The weights are adjusted through repeated verification of the test results, which realizes dynamic optimization of the weights. Precision weighting and time weighting together produce the test-set input for the second level of stacking, completing the improved stacking ensemble algorithm.

The experiments use the stock data of three subsidiaries of the Aerospace Science and Technology Group. First, regression prediction is used to verify that feature selection with the integrated feature score is effective and to find the number of features corresponding to the peak of the Hughes phenomenon. Then the improved stacking algorithm, which integrates Ridge regression, Random Forest, and XGBoost, is applied to predict the closing prices of the three stocks. The experimental results show that the new ensemble algorithm outperforms every single algorithm and, compared with traditional stacking, has an advantage in prediction at the "fen" position. It therefore has some reference value for short-term stock price prediction.
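
The integrated feature scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three scores are combined, so the min-max normalization and equal-weight average below are assumptions, as are all function names. The model-based importance (the XGBoost weight factor in the thesis) is passed in as a plain array with placeholder values rather than computed, and the data is synthetic.

```python
import numpy as np

def pearson_abs(X, y):
    """Absolute Pearson correlation of each column of X with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc / denom)

def ordinal_rank(a):
    """Ranks along axis 0 (ties are not averaged, acceptable for continuous data)."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)

def spearman_abs(X, y):
    """Absolute rank correlation: Pearson correlation computed on ranks."""
    return pearson_abs(ordinal_rank(X), ordinal_rank(y))

def min_max(s):
    """Scale a score vector to [0, 1] so the three scores are comparable."""
    span = s.max() - s.min()
    return (s - s.min()) / span if span > 0 else np.zeros_like(s)

def integrated_score(X, y, model_importance):
    """Assumed combination rule: equal-weight mean of the normalized scores."""
    parts = [pearson_abs(X, y),
             spearman_abs(X, y),
             np.asarray(model_importance, dtype=float)]
    return np.mean([min_max(p) for p in parts], axis=0)

def select_top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return np.argsort(scores)[::-1][:k]

# Synthetic demo: only features 0 and 2 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)
importance = np.array([0.5, 0.02, 0.4, 0.05, 0.03])  # placeholder, not real model output

scores = integrated_score(X, y, importance)
top = select_top_k(scores, 2)
print("scores:", np.round(scores, 3))
print("selected features:", sorted(top.tolist()))
```

To look for the peak the abstract attributes to the Hughes phenomenon, one could repeat the regression for increasing `k` and keep the `k` with the lowest validation error.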
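
The weighted fusion of first-level test-set predictions can be sketched as follows. The abstract gives no formulas, so the inverse-RMSE precision weight, the exponential time decay, and the product rule that combines them are assumptions chosen for illustration. All names and numbers are hypothetical, and the dynamic weight adjustment mentioned in the abstract is not included.

```python
import numpy as np

def precision_weights(val_rmse):
    """Assumed rule: weight each fold model by the inverse of its validation RMSE."""
    inv = 1.0 / (np.asarray(val_rmse, dtype=float) + 1e-12)
    return inv / inv.sum()

def time_weights(distances, decay=0.5):
    """Assumed rule: folds closer in time to the test set get larger weights."""
    w = np.exp(-decay * np.asarray(distances, dtype=float))
    return w / w.sum()

def fuse_test_predictions(fold_preds, val_rmse, distances, decay=0.5):
    """Fuse the k fold models' test-set predictions for one base learner.

    fold_preds: array of shape (k, n_test), one row per fold model.
    Returns the fused prediction of shape (n_test,) and the weights used.
    """
    w = precision_weights(val_rmse) * time_weights(distances, decay)
    w = w / w.sum()
    return w @ np.asarray(fold_preds, dtype=float), w

# Hypothetical numbers: 3 fold models predicting 2 test-day closing prices.
fold_preds = np.array([[10.0, 10.2],
                       [10.4, 10.6],
                       [10.1, 10.3]])
val_rmse = [0.2, 0.4, 0.1]   # validation error of each fold model
distances = [2, 1, 0]        # time distance of each fold from the test set

fused, weights = fuse_test_predictions(fold_preds, val_rmse, distances)
print("weights:", np.round(weights, 3))
print("fused prediction:", np.round(fused, 3))

# With equal errors and no time decay the rule reduces to the arithmetic mean
# used by traditional stacking.
baseline, _ = fuse_test_predictions(fold_preds, [1, 1, 1], [0, 0, 0], decay=0.0)
```

Under these assumptions, the fused vector of each base learner becomes one column of the second-level test input, in place of the plain mean over folds.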
Keywords/Search Tags: stock price prediction, machine learning, Stacking, XGBoost, correlation coefficient