| The stock market is an important component of the socialist market economy, and fluctuations in stock prices reflect the development of the macro economy. As the market trades daily, stock price forecasting has become a common concern of the academic community. However, stock prices are influenced by many factors and are constantly in a non-linear, dynamic state. More accurate forecasts would help investors reduce investment risk and build portfolio strategies, and would also provide a strong reference for theoretical research on the Chinese stock market. The selection and quantification of the factors that influence stock prices, and the application of forecasting models, are therefore of great theoretical value. Because stock prices are sensitive to changes in their influencing factors, predicting them is extremely difficult. Researchers have applied a variety of data sources, including historical trading information, macroeconomic indicators, technical indicators, online public opinion and financial research reports, together with newer modelling methods such as modern econometrics, machine learning and artificial intelligence. Different data sources capture different aspects of influence, and forecasting studies that integrate multiple sources can make full use of the correlations between them and improve accuracy. To this end, this paper takes Ping An of China (601318.SH) as the research object and uses four data sources (historical trading information, fundamental characteristics, technical characteristics and sentiment characteristics) to generate 15 feature vectors from the original 43 predictor variables as the basis for stock price prediction. To avoid the curse of dimensionality caused by high-dimensional data, this paper first uses Principal Component Analysis (PCA) to dimensionally 
reduce the input data, eliminating redundancy and improving the model's learning efficiency; second, it introduces the CEEMDAN signal decomposition method to decompose the previous closing price series and uses a fine-to-coarse reconstruction algorithm to generate new features at different scales, in order to fully exploit the implicit information in the price series and alleviate the lag problem that often arises in forecasting. The principal component scores are then fused with the reconstructed features to build a Long Short-Term Memory (LSTM) neural network model for stock price prediction, which in theory lays the foundation for improving the model's generalization ability and predictive performance. The LSTM algorithm is well suited to long time series, but before training, control parameters such as the number of hidden-layer neurons, the dropout ratio and the time step must be set. To achieve the best performance from the LSTM network, this paper introduces the Particle Swarm Optimization (PSO) algorithm to optimise these control parameters automatically so that the input features match the network cell structure, further improving the PCP-LSTM stock price prediction model that incorporates multiple data sources. Finally, two sets of control experiments verify the validity of the constructed model: a model control group, which compares its prediction performance with that of other deep learning models; and a data source control group, which compares the prediction performance of the PCP-LSTM model on different datasets. In addition, the robustness of the PCP-LSTM model is tested in two ways: first, by sliding the LSTM algorithm forward to extend a new dataset beyond the test set and making out-of-sample predictions with the PCP-LSTM model; second, by using the PCP-LSTM model to predict the representative SSE Composite Index based on the variable 
design of the same method. The experimental results show that the PCP-LSTM stock price forecasting model constructed in this paper achieves the best forecasting performance among the models compared, shows a degree of generalization ability and robustness, and demonstrates the practical value of multi-source data fusion. |
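
Two of the steps described in the abstract, PCA dimensionality reduction and PSO parameter search, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. The data are synthetic, the choice of 10 retained components is purely illustrative, and `pca_scores`, `pso_minimize` and `toy_validation_error` are hypothetical names. The toy objective is an assumed stand-in for the validation error of a trained LSTM, and the PSO settings (inertia 0.7, acceleration coefficients 1.5) and search bounds are assumptions as well.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project standardized columns of X onto the top principal components."""
    # Standardize each predictor (assumes no column is constant).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    return Xs @ Vt[:n_components].T, explained[:n_components]


def pso_minimize(objective, lower, upper, n_particles=20, n_iter=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best particle swarm search inside box bounds."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    gbest_val = pbest_val.min()
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Pull each particle toward its own best and the swarm's best.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        if pbest_val.min() < gbest_val:
            gbest_val = pbest_val.min()
            gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for LSTM validation error (not a real model)."""
    neurons, dropout, time_step = params
    return (((neurons - 64) / 64) ** 2
            + ((dropout - 0.2) / 0.2) ** 2
            + ((time_step - 10) / 10) ** 2)


# Synthetic stand-in for a matrix of 43 predictor variables.
rng = np.random.default_rng(42)
X = rng.normal(size=(250, 43))
scores, ratio = pca_scores(X, n_components=10)
print("score matrix shape:", scores.shape)
print("cumulative explained variance:", ratio.sum())

# Assumed search ranges: hidden neurons, dropout ratio, time step.
lower = [8, 0.0, 2]
upper = [256, 0.5, 30]
best, best_val = pso_minimize(toy_validation_error, lower, upper)
print("best parameters found:", best, "objective:", best_val)
```

In a real setting the objective would train an LSTM with the candidate parameters and return its validation error, and the integer-valued parameters (neurons, time step) would be rounded before use.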