Research And Application Of Deep Learning Based Stock Market Prediction And Evaluation

Posted on: 2021-04-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Liu
GTID: 1368330605981264
Subject: Intelligent Science and Technology
Abstract/Summary:
With the rapid progress of global economic integration, the stock market plays an increasingly important role in the global economy, and precise stock market prediction has significant socio-economic value. Stock market data is massive, extensive, and heterogeneous, which makes it difficult to capture the implicit patterns and correlations in the market. Research on stock market prediction therefore also has academic value. In recent years, deep learning has made remarkable progress in fields such as computer vision, speech recognition, and natural language processing, showing that it can process many data types, in particular multi-scale data (different time scales such as seconds, minutes, days, and weeks), multi-source data (different sources such as stock markets, social networks, and web news), and heterogeneous data (different forms such as numerical data, text, and images). It is thus a powerful tool for predicting stock markets, whose data is multi-scale, multi-source, and heterogeneous. Based on in-depth study of the characteristics of stock market prediction problems and an extensive analysis of existing related work, this dissertation proposes a series of solutions to three key issues in stock market forecasting and evaluates them on multiple public real-world datasets. The main research tasks and results are as follows.

First, this dissertation proposes a Multi-Scale Recurrent Convolutional Neural Network (MS-RCNN) model for stock market prediction with multi-scale data. The model consists of four layers: the reconstruction layer, the feature layer, the fusion layer, and the output layer. The reconstruction layer reconstructs multi-scale data from the input data. The feature layer uses convolutional networks to automatically extract features from the data at each scale as a representation of the state of the stock market, and a multi-scale state representation is then constructed by combining the representations from the various scales. Next, the fusion layer uses a Recurrent Neural Network to generate a fused representation that captures both the temporal correlation within the multi-scale representation and the complementarity between scales. Finally, the output layer makes the prediction from the fused representation. Experiments on three financial time series datasets from the Chinese stock markets show that MS-RCNN achieves the highest accuracy in trend classification and the highest profit in market simulation compared with many existing advanced baseline methods. Specifically, it reaches 54% accuracy and 55% F1, each at least 2% higher than the best baseline model. In market simulation, its accumulated profit exceeds that of the best baseline model by 2% to 10% across the three datasets.

Second, this dissertation proposes a Numerical-Based Attention (NBA) model for multi-source heterogeneous stock market prediction. The stock market includes heterogeneous data from multiple sources, mainly structured numerical data and unstructured text data, and these sources affect the market in different ways. Numerical data carries level information about stock price changes, while textual data mainly carries trend information about stock price movements. The NBA model exploits the complementarity between numerical and textual data to predict stock prices. The method first uses a multi-source encoder to encode the numerical data and the textual data into fixed-length vectors. It then uses the encoded text to guide the calculation of the attention weights for the numerical data, thereby obtaining a hybrid content vector. In this way, the stock trend information hidden in the textual data is converted into an importance distribution over the numerical data; that is, the selection of the numerical data is guided by the encoded textual data. This lets the NBA model filter noise and make full use of the trend information in textual data. To evaluate the NBA model, this dissertation builds three datasets by collecting news corpora and numerical data from two stock markets, the China Securities Index 300 (CSI300) and the Standard and Poor's 500 (S&P500). Extensive experiments show that NBA achieves the highest classification accuracy and the lowest average mean squared error in multi-source stock price forecasting compared with multiple advanced baseline models. In particular, its accuracy is 1.75%, 2.60%, and 6.04% higher than that of the best baseline model on the three datasets, and reaches 64.47% on the minute-frequency dataset.

Third, a new metric, the Mean Profit Rate (MPR), is proposed to overcome the profit bias in the evaluation of individual stock prediction models. Existing evaluations of stock market prediction models often use classification metrics such as accuracy, which leads to inconsistency between the profitability and the classification performance of a given model. As a result, the model selected as best by such a metric can be less profitable than models that score lower on it. To address this problem, MPR considers the expected profit of each prediction together with the accuracy of each prediction, and so evaluates the average expected rate of return of a model's predictions without profit bias. Extensive experiments with multiple advanced stock market prediction models on five daily stock index datasets from four countries show that the correlation between MPR and the profit in simulated trading is significantly higher than that of the classical classification metrics, and that MPR is less likely than those metrics to select the less profitable model. These findings indicate that MPR is a more effective metric than classification metrics for evaluating individual stock trend forecasts.

Finally, based on the above research, we develop the Tianyan Stock Market Analysis System (TSMAS), a stock market analysis and visualization system for the Chinese stock market. The system has three functions. The first is multi-scale stock market forecasting from single-scale stock data input. The second is multi-source stock market forecasting from numerical and textual data. The third is evaluation of the above analysis results using MPR and many other metrics.
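
The text-guided attention step described for the NBA model, in which encoded text determines attention weights over encoded numerical data to produce a hybrid content vector, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the scaled dot-product scoring function, the function name `text_guided_attention`, and the toy vectors are all assumptions made for illustration, and the abstract does not specify how the scores are actually computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_guided_attention(numeric_states, text_vector):
    """Weight encoded numerical time steps by their match with encoded text.

    numeric_states: list of T vectors, one per encoded numerical time step.
    text_vector:    one encoded text vector of the same dimension.
    Scoring is a scaled dot product (an assumption, not from the source).
    Returns (weights, context), where context is the weighted sum of the
    numerical states, i.e. a hybrid content vector.
    """
    d = len(text_vector)
    scores = [
        sum(h_j * q_j for h_j, q_j in zip(h, text_vector)) / math.sqrt(d)
        for h in numeric_states
    ]
    weights = softmax(scores)
    context = [
        sum(w * h[j] for w, h in zip(weights, numeric_states))
        for j in range(d)
    ]
    return weights, context


# Toy example: three encoded time steps and one encoded news vector.
numeric_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
text_vector = [2.0, 0.0]
weights, context = text_guided_attention(numeric_states, text_vector)
print(weights, context)
```

In this toy input, the time step that aligns most closely with the text vector receives the largest weight, which mirrors the idea that trend information in the text is converted into an importance distribution over the numerical data.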
Keywords/Search Tags: Stock market prediction, metric, deep learning, multi-scale, multi-source