Real-world data are often collected at different frequencies and with different feature sets; such data are called mixed-frequency data. Their defining characteristics are that the number of samples differs across frequencies and that the feature sets are inconsistent. In most scenarios, mixed-frequency data are also a special kind of time series. This thesis studies the processing of mixed-frequency data from the perspective of deep learning modeling. Variables observed at different frequencies contain rich latent information that plays an important role in forecasting the target indicator. Traditional methods convert mixed-frequency data into same-frequency data, which causes information loss and artificial data gaps: the information in the mixed-frequency data is not fully used, additional errors are introduced into the model, and the nonlinear and temporal features of the data are ignored. In the era of big data, traditional methods can no longer meet these requirements. To address these issues, this thesis proposes a hybrid model that combines an econometric method with a long short-term memory (LSTM) network. The model processes the original data directly and captures the nonlinear patterns and temporal characteristics of the mixed-frequency data. An attention mechanism adaptively selects the input features that matter most for the target indicator, which improves model training. Existing models for mixed-frequency data consider only the original numerical data and ignore information from other domains. Using mixed-frequency data together with external information in a temporally consistent way is a challenging task. This thesis therefore extends the model to incorporate external text information along the time dimension. The improved model uses natural language processing techniques to vectorize the text, integrates the resulting vectors into the model, and uses an LSTM network to learn their temporal information, training the components jointly. Adding external text information improves the model's real-time prediction accuracy and enables it to respond to unexpected events, which gives it better generalization performance. The results show that the MSE of the proposed method is about 0.2 lower than that of the best-performing comparison method. This illustrates that exploring temporal features, selecting input features, and using external information are effective. The results validate the proposed method and offer a new approach to processing mixed-frequency data.