Research On Mixed Frequency Forecasting Model Based On Textual Data And Its Applications

Posted on: 2021-07-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Wang
GTID: 1488306464958169
Subject: Management Science and Engineering
Abstract/Summary:
This dissertation proposes a new mixed-frequency forecasting model based on textual data. The model covers cases in which unstructured text appears among the independent variables and the independent variables are sampled either at the same frequency (same-frequency) or at different frequencies (mixed-frequency). Its purpose is to handle the unstructured textual data and mixed-frequency data that arise in management problems in the era of big data. With the rapid development of information technologies such as the mobile Internet and cloud computing, the types of data that can be acquired have become abundant, and acquisition and transmission costs keep falling. Data now differ not only in sampling frequency but also in form, since much of it is text. In practice, the factors that affect a forecast often arrive as both mixed-frequency data and textual data. In financial markets, for example, monthly market volatility is influenced by daily and weekly trading information and by monthly macroeconomic information; it is also influenced by unstructured text such as news, company financial announcements, and forum posts, which can provide further insight into market trends and sentiment fluctuations. How to handle inconsistent data types together with inconsistent sampling frequencies is therefore one of the most pressing issues for companies and researchers today.

A review of current research on mixed-frequency problems shows that there is still room for improvement: (1) Because information comes from different channels, textual data often appear, and existing mixed-frequency models cannot use textual information to study the forecast target. (2) Although mixed-frequency forecasting with a single independent variable has been studied, the nonlinear and complex relationships that arise when the dimensions of the dependent and independent variable data do not match have not been fully explored. (3) Correspondingly, in multivariate mixed-frequency forecasting models, the appearance of textual data makes the mixed-frequency relationship between the independent variables and the dependent variable more complex, and the independent variables may themselves be sampled at different frequencies, which limits what existing models can handle.

A semantic vector model can recover more semantic information, the MIDAS model offers a new perspective for dealing with mixed-frequency data, and the long short-term memory (LSTM) network can effectively handle a variety of nonlinear time series. This dissertation therefore integrates the advantages of these three approaches and focuses on mixed-frequency forecasting models with textual data and their applications, in order to address both the inconsistent sampling frequencies between variables and the unstructured nature of text in forecasting research. The main work falls into the following three parts.

First, a univariate mixed-frequency LSTM forecasting model is constructed. Current management practice and economic forecasting involve many mixed-frequency problems in which variables are sampled at different frequencies. Building on the distributed lag model, the existing MIDAS model uses a polynomial function to weight and aggregate the high-frequency data directly, which avoids the information loss caused by subjective preprocessing of mixed-frequency data. However, as the nonlinear characteristics of the data become more pronounced, existing models produce large deviations in their forecasts. By introducing the long short-term memory network and combining it with the idea of the mixed-frequency data sampling model, a univariate mixed-frequency LSTM forecasting model is constructed. The proposed model integrates the MIDAS model and the LSTM network, and the procedure for parameter optimization and solution is given. Finally, the model is applied to an empirical analysis of stock market volatility. The test results show that it has advantages over the existing univariate MIDAS family of models.

Second, building on the previous work, this dissertation constructs, for the first time, a same-frequency multivariate mixed-frequency forecasting model based on textual data. How to make full use of mixed-frequency data that include text is a current research focus and is closer to practical needs. Existing mixed-frequency forecasting models are usually built on structured data, whereas text is usually unstructured, so extracting useful textual information accurately and efficiently and examining its predictive ability become important issues. To address these problems, this dissertation builds on the idea of MIDAS and integrates the LSTM network and the semantic vector model to construct the model. In this model, the independent variables share the same sampling frequency and include textual data, while the independent variables as a group stand in a mixed-frequency relationship with the dependent variable. The model is then applied to forecasting stock index volatility in different markets and compared with mixed-frequency forecasting models based on structured numerical data; the comparison demonstrates its applicability and superiority.

Third, building on the first two parts, this dissertation constructs a mixed-frequency multivariate mixed-frequency forecasting model based on textual data, which is its main innovation. In the preceding work, the independent variables of the text-based mixed-frequency forecasting model were same-frequency data. In practical management forecasting problems, however, the available data types are increasingly abundant: the variables include textual data, and the independent variables themselves are sampled at mixed frequencies. The existing multivariate MIDAS model cannot directly handle frequency inconsistency among the independent variables when those variables also contain textual information. This dissertation therefore constructs a mixed-frequency forecasting model based on textual data and applies it to forecasting stock index volatility. Experimental comparison shows that the model is effective and feasible. It applies when the independent variables include textual data and are sampled at different frequencies from one another, which extends the scope of mixed-frequency forecasting models and has both innovative and practical significance.
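The abstract notes that MIDAS aggregates high-frequency data through a polynomial weighting function. A minimal sketch of that step follows, assuming the exponential Almon lag polynomial, a common choice in the MIDAS literature; the abstract does not say which weighting function the dissertation uses. The function names, parameter values, and daily series are hypothetical, and the LSTM and semantic vector components are not shown.

```python
import math


def exp_almon_weights(theta1, theta2, n_lags):
    """Normalized exponential Almon weights for lags 1..n_lags.

    Assumption: this weighting scheme is illustrative only; it is not
    confirmed to be the one used in the dissertation.
    """
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def midas_aggregate(high_freq, weights):
    """Collapse a high-frequency window into one low-frequency regressor.

    weights[0] is applied to the most recent observation, weights[1] to
    the one before it, and so on.
    """
    if len(high_freq) < len(weights):
        raise ValueError("need at least as many observations as weights")
    most_recent_first = high_freq[::-1][: len(weights)]
    return sum(w * x for w, x in zip(weights, most_recent_first))


# Hypothetical example: 22 daily observations within one month.
daily = [0.01 * (i % 5) for i in range(22)]

# theta2 < 0 makes the weights decay, so recent days count more.
w = exp_almon_weights(0.0, -0.01, n_lags=22)
monthly_regressor = midas_aggregate(daily, w)
print(round(monthly_regressor, 6))
```

Setting both parameters to zero gives equal weights, so the aggregate reduces to a simple average of the window. In a fitted model the two parameters would be estimated jointly with the regression coefficients rather than fixed by hand.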
Keywords/Search Tags: Textual data, Mixed Frequency Data Forecasting, MIDAS series model, Long Short-term Memory Network