| With the development of the Chinese futures market, futures, with their flexible trading mechanism, are gradually attracting investor attention. As an important instrument reflecting market expectations, CSI 300 stock index futures have also become a research focus for academics and financial practitioners. Based on stock index futures trading data, this paper uses technical analysis and machine learning methods to build daily and intraday high-frequency prediction models, and then constructs investment strategies from the model outputs. To build a strategy that delivers significant excess returns in backtesting, the paper first analyzes high-frequency trading data and, drawing on the theory of the immediate price impact model, uses multiple regression, cluster analysis, and other methods to measure the trading intention of buyers and sellers within a single day, finally building a buying and selling intention index for daily market prediction. Second, existing traditional technical indicators are screened and classified into volatility, volume-price, and momentum indicators. To alleviate collinearity among the indicators, the paper uses principal component analysis to extract their information. For the prediction model, SVM, XGBoost, random forest, and LSTM models were tried in turn and their parameters optimized. The investment strategy for daily market prediction is built on the voting results of multiple models, and positions are set according to the votes. Since the margin requirement for stock index futures is only about 15%, the paper calculates the portfolio's backtested net value under different leverage levels in turn. Finally, the paper introduces a performance evaluation system to compare the risk and return of the portfolio with those of the benchmark.
Daily market forecasts often cannot meet the needs of institutional investors, so the paper goes further and builds an intraday high-frequency forecasting model. For indicator construction, the paper builds an order book slope indicator that reflects the state of resting orders at each moment, so that order book information is captured comprehensively. Among technical indicators, only MACD and RSI, which are applicable to high-frequency data, are retained. Because the intraday price trend is strongly path dependent, the paper keeps all data from the first-order lag to the fifth-order lag to expand the feature set. Intraday high-frequency prediction is essentially a three-class classification problem, and the paper tries XGBoost, random forest, and gcForest models in turn. The investment strategy is constructed in a similar way to the daily market model. However, because intraday high-frequency trading involves a large number of transactions, transaction cost is an important factor in the final return, so the paper calculates the portfolio return under different transaction cost scenarios.
The results show that the LSTM model is not suitable for daily market data prediction, while the SVM, XGBoost, and random forest models all achieve an accuracy above the 51% minimum arbitrage threshold. The investment strategy built on these models still achieves an annual excess return of 7.17% without leverage, and all evaluation metrics improve substantially relative to the benchmark. All three models used for high-frequency prediction achieve high accuracy, but the high-frequency investment strategies start to lose money once the transaction cost exceeds 0.4‰. In the course of the research, the paper proposes several methods for mining tick-level data, explores ways of extracting information from existing technical indicators, applies a range of machine learning and deep learning models to stock index futures prediction, and summarizes a path from model results to the construction of an investment strategy. |
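
The abstract describes setting positions from model votes and then evaluating net value under different leverage levels and transaction cost scenarios. A minimal Python sketch of that idea follows. The vote-to-position mapping (net votes divided by the number of models), the turnover-proportional cost model, the function names `position_from_votes` and `net_value_curve`, and the sample numbers are all assumptions made for illustration; they are not the paper's actual rules or data.

```python
from typing import List, Sequence


def position_from_votes(votes: Sequence[int]) -> float:
    """Map model votes (+1 = up, -1 = down, 0 = flat) to a position in [-1, 1].

    Assumed rule: position equals the net vote divided by the number of models.
    """
    if not votes:
        return 0.0
    return sum(votes) / len(votes)


def net_value_curve(
    positions: Sequence[float],
    returns: Sequence[float],
    leverage: float = 1.0,
    cost_rate: float = 0.0,
) -> List[float]:
    """Compound a net value series starting from 1.0.

    positions[t] is the position held over the period whose futures return is
    returns[t]. Cost is assumed proportional to the change in leveraged exposure.
    """
    if len(positions) != len(returns):
        raise ValueError("positions and returns must have the same length")
    nav = 1.0
    prev_exposure = 0.0
    curve: List[float] = []
    for pos, ret in zip(positions, returns):
        exposure = leverage * pos
        turnover = abs(exposure - prev_exposure)
        nav *= 1.0 + exposure * ret - cost_rate * turnover
        curve.append(nav)
        prev_exposure = exposure
    return curve


if __name__ == "__main__":
    # Hypothetical votes from three models over three days, with made-up returns.
    daily_votes = [[1, 1, -1], [1, 1, 1], [-1, -1, 1]]
    daily_returns = [0.01, -0.02, -0.01]
    positions = [position_from_votes(v) for v in daily_votes]
    for lev in (1.0, 2.0):
        # cost_rate=0.0004 corresponds to a 0.4 per mille cost scenario
        print(lev, net_value_curve(positions, daily_returns, leverage=lev, cost_rate=0.0004))
```

Running the same positions through several `leverage` and `cost_rate` values gives one net value curve per scenario, which can then be compared against a benchmark curve.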