
A Bayesian MCMC Method To Detect The Homogeneous Effects In High Dimensional Panel Data Analysis

Posted on: 2013-07-05
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
GTID: 2249330377954648
Subject: Finance
Abstract/Summary:
We build economic models to analyze the laws that lie behind economic phenomena. Expressed in mathematical language, such models describe the complex relationships among influencing factors. These factors are represented as economic variables, which are divided into explanatory variables and explained variables according to the causal relationship between them. Observations of these variables come in several data types, of which panel data is the most widely used in empirical studies. Panel data contains many individual observations over time and provides dynamic information that neither cross-sectional data nor time series data can provide. Panel data also contains heterogeneity, both across time and across individuals, so a predictor may affect the response in substantially different ways. Some predictors have a homogeneous effect on all subjects, while others may have a heterogeneous effect. Distinguishing these two kinds of effects is very important, particularly in the high dimensional setting, because it helps people understand and explain the patterns behind the data more correctly. To this end, a novel yet effective Markov chain Monte Carlo (MCMC) algorithm is proposed in this article. It is especially suitable for high dimensional panel data, and it estimates the coefficients and selects the model at the same time.

Although the heterogeneous effect can vary over time, the case of most practical interest is a heterogeneous effect that varies over individuals. In this article, we assume that the heterogeneous effect varies over individuals, not over time, and we consider the problem within the framework of a linear regression model. Whether the heterogeneity is over individuals or over time, it can be handled by a similar method.
As long as the model is correctly specified and the data are arranged appropriately, the conclusions do not change. When the data do not support the hypothesis of constant slope coefficients, it seems reasonable to allow the parameters to vary across cross-sectional units in order to take the between-individual heterogeneity into account.

Based on the existing literature on model selection, we set up a hierarchical Bayesian mixture panel model. Here "hierarchical" mainly means that the coefficients of the explanatory variables are assigned a specified multivariate normal distribution, and that the elements of the covariance matrix of this distribution are in turn assumed to follow an inverse gamma distribution. A simple Bayesian approach may lead to a complex joint distribution of all parameters, but the hierarchical Bayesian model avoids this problem. We suppose that the heterogeneous effect on individuals can be represented by a random coefficient. When facing a real data set, one has no prior knowledge of which predictors have a heterogeneous or a homogeneous effect. As a starting point, we tentatively assume that all variables have heterogeneous effects. From the hierarchical Bayesian setup, we are able to make inference about the covariance matrix elements of the multivariate normal distribution. If an explanatory variable has a heterogeneous effect, the corresponding covariance matrix element will be substantially different from 0; otherwise, the covariance matrix element will be close to 0, which indicates a predictor with a homogeneous effect.

Then, we propose a sampling procedure based on the MCMC algorithm. By repeating the procedure, we obtain the posterior mean estimate of the covariance matrix elements. For the other parameters, we calculate the posterior mean estimates in a similar manner.
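The sampling scheme described above can be illustrated with a minimal Gibbs sampler sketch. It is not the thesis's own algorithm, whose details the abstract does not give. It assumes a model y_it = x_it' beta_i + eps_it with beta_i ~ N(mu, D), a diagonal D = diag(d_1, ..., d_p), inverse gamma priors IG(a, b) on each d_j and on the error variance, and a flat prior on mu. The function names, the hyperparameters `a` and `b`, and the diagonal covariance are all assumptions made for illustration.

```python
import numpy as np


def simulate_panel(n, T, d_true, mu_true, sigma=0.5, seed=0):
    """Simulate y_it = x_it' beta_i + eps_it with beta_ij ~ N(mu_j, d_j)."""
    rng = np.random.default_rng(seed)
    d_true = np.asarray(d_true, dtype=float)
    mu_true = np.asarray(mu_true, dtype=float)
    p = mu_true.size
    X = rng.normal(size=(n, T, p))
    beta = mu_true + np.sqrt(d_true) * rng.normal(size=(n, p))
    y = np.einsum("itj,ij->it", X, beta) + sigma * rng.normal(size=(n, T))
    return y, X


def gibbs_random_coef(y, X, n_iter=1000, burn_in=500, a=0.01, b=0.01, seed=0):
    """Gibbs sampler for a random coefficient panel model (illustrative).

    y has shape (n, T); X has shape (n, T, p).
    Returns posterior means computed from the draws after burn-in.
    """
    rng = np.random.default_rng(seed)
    n, T, p = X.shape
    beta, mu, d, sigma2 = np.zeros((n, p)), np.zeros(p), np.ones(p), 1.0
    XtX = np.einsum("itj,itk->ijk", X, X)  # per-individual X'X, shape (n, p, p)
    Xty = np.einsum("itj,it->ij", X, y)    # per-individual X'y, shape (n, p)
    kept = n_iter - burn_in
    post = {"beta": np.zeros((n, p)), "mu": np.zeros(p),
            "d": np.zeros(p), "sigma2": 0.0}
    for it in range(n_iter):
        # beta_i | rest ~ N(P^{-1} r, P^{-1}), P = X_i'X_i/sigma2 + D^{-1}
        for i in range(n):
            prec = XtX[i] / sigma2 + np.diag(1.0 / d)
            mean = np.linalg.solve(prec, Xty[i] / sigma2 + mu / d)
            chol = np.linalg.cholesky(prec)  # prec = chol @ chol.T
            beta[i] = mean + np.linalg.solve(chol.T, rng.normal(size=p))
        # mu | rest ~ N(mean of beta_i, D / n) under a flat prior
        mu = rng.normal(beta.mean(axis=0), np.sqrt(d / n))
        # d_j | rest ~ IG(a + n/2, b + sum_i (beta_ij - mu_j)^2 / 2)
        ss = ((beta - mu) ** 2).sum(axis=0)
        d = 1.0 / rng.gamma(a + n / 2, 1.0 / (b + ss / 2))
        # sigma2 | rest ~ IG(a + nT/2, b + residual sum of squares / 2)
        rss = ((y - np.einsum("itj,ij->it", X, beta)) ** 2).sum()
        sigma2 = 1.0 / rng.gamma(a + n * T / 2, 1.0 / (b + rss / 2))
        if it >= burn_in:  # discard the burn-in draws
            post["beta"] += beta / kept
            post["mu"] += mu / kept
            post["d"] += d / kept
            post["sigma2"] += sigma2 / kept
    return post


if __name__ == "__main__":
    # First predictor heterogeneous (variance 1), the other two homogeneous.
    y, X = simulate_panel(30, 20, d_true=[1.0, 0.0, 0.0], mu_true=[1.0, -1.0, 0.5])
    print(gibbs_random_coef(y, X)["d"])
```

In this sketch, the posterior mean of `d` plays the role of the covariance matrix elements discussed above: a value far from 0 suggests a heterogeneous effect, and a value near 0 suggests a homogeneous one. The default of 1000 iterations with 500 burn-in draws mirrors the simulation setting reported in the abstract.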
In particular, for each data set, once the predictors with homogeneous effects have been determined according to our criterion, we take the mean over all individuals as the estimate of the coefficients of those predictors. Based on the posterior mean estimate of the covariance matrix elements, we make inference about whether the j-th predictor has a heterogeneous or a homogeneous effect. A large value corresponds to a heterogeneous effect, and a value close to 0 corresponds to a homogeneous effect. Following an intuitive derivation, we put forward a criterion to determine a threshold that is used as the cut value for the covariance matrix elements. Our Monte Carlo simulation studies suggest that this simple criterion works well.

To evaluate the performance of our MCMC algorithm, we conducted extensive Monte Carlo simulations. In these studies, we consider three different settings of sample size and predictor dimension. For each model, we simulated 100 data sets, and for each data set we drew 1000 samples. The first 500 draws are treated as a "burn in" period; that is, we use the last 500 draws of the sampling series to make inference.

The set of predictors with homogeneous effects in the true model and the set determined by our criterion can stand in one of three relationships, namely underfitted, correctly fitted and overfitted. We report the percentages of correctly fitted, underfitted, and overfitted models over the 100 data sets in Table 1. After that, we define the mean of absolute relative errors and use it to evaluate the accuracy of the MCMC algorithm.

In the field of corporate finance and capital markets, a firm’s ability to make profits is usually measured by return on equity (ROE). Many fundamental value analysts rely on it to judge the investment value of stocks. Thus, it is crucial to predict ROE accurately. Financial statements contain a large amount of information about a firm’s operation and management.
It is natural to build a model that predicts a firm’s ROE for the next year from the current year’s statement. However, the internal relationship between ROE and the statement can be rather complicated. Moreover, the statement consists of a huge number of accounting variables. Hence, a conventional model may not capture such a complicated relationship. This motivates us to apply our MCMC algorithm to investigate the relationship between next year’s ROE and the current year’s statement.
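The selection and evaluation steps described earlier in the abstract can be sketched as follows. The abstract does not state the threshold criterion or the exact definitions of the three fit types, so everything here is an assumption for illustration: the `threshold` value is a free parameter, "overfitted" is taken to mean that every truly heterogeneous predictor is flagged along with some extra ones, "underfitted" that at least one truly heterogeneous predictor is missed, and the mean of absolute relative errors is taken to be the average of |estimate - truth| / |truth|.

```python
import numpy as np


def classify_effects(d_hat, threshold):
    """Flag predictor j as heterogeneous when its variance estimate exceeds
    the cut value. The threshold is a hypothetical input, not the thesis's."""
    return np.asarray(d_hat, dtype=float) > threshold


def homogeneous_estimate(beta_hat, hetero):
    """Average the individual coefficients (rows of beta_hat) for the
    predictors flagged as homogeneous."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    hetero = np.asarray(hetero, dtype=bool)
    return beta_hat[:, ~hetero].mean(axis=0)


def fit_type(selected_hetero, true_hetero):
    """Compare selected and true heterogeneous sets (assumed definitions)."""
    selected = np.asarray(selected_hetero, dtype=bool)
    truth = np.asarray(true_hetero, dtype=bool)
    if np.array_equal(selected, truth):
        return "correct"
    if np.all(selected[truth]):
        # All truly heterogeneous predictors found, plus at least one extra.
        return "overfitted"
    # At least one truly heterogeneous predictor was treated as homogeneous.
    return "underfitted"


def mean_abs_relative_error(estimate, truth):
    """Mean of |estimate - truth| / |truth| (assumed definition)."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(estimate - truth) / np.abs(truth)))


if __name__ == "__main__":
    d_hat = [1.2, 0.01, 0.02]  # made-up posterior means for illustration
    hetero = classify_effects(d_hat, threshold=0.1)
    print(hetero, fit_type(hetero, [True, False, False]))
```

Repeating `fit_type` over simulated data sets and tabulating the three outcomes would give percentages of the kind reported in Table 1 of the thesis.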
Keywords/Search Tags: Bayesian Analysis, Hierarchical Panel Model, MCMC