
A Research And Application On Multivariate Data Mining Based On Statistical Inference

Posted on: 2019-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Wang
Full Text: PDF
GTID: 2348330563953949
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology, led by computer science, the era of "big data" has arrived. Researchers can now gather massive data sets and analyze them in depth. A central problem in data analysis is how to build an analytical model for such data, particularly as the data become more diverse and complex. Traditional data analysis models struggle with high-dimensional data and very large networks. This thesis aims to extract the structural information of networks from the similarity information and topological information of network nodes, using an analytical model that combines multivariate data. We verify the model's ability to mine structural information and then apply it to financial time series data. A multivariate data analysis model based on statistical inference can provide an effective method for mining structural information from time series and mixed data. The work of this thesis covers two main aspects:

1) The basic concepts of the multivariate data model based on statistical inference, the proposed analysis model, and an analysis of its validity. The thesis introduces the structural models, ideas, and definitions of the similarity matrices commonly used in complex networks, and gives the theoretical derivation of a similarity matrix based on compressed sensing. It then defines metadata and combines the metadata's label information with node network topology information in the analysis model based on statistical inference. The model's effect on community structure mining is analyzed on computer-generated networks and on real social network data.

2) The effect of the statistical-inference-based multivariate data model on mining network community structure in financial time series data. The thesis introduces two common financial data sets: the S&P 500 data set and the 2007 Shanghai Stock Exchange shares data set. We build the similarity matrix of the financial time series with the compressed-sensing-based algorithm, analyze how to derive the adjacency matrix from the similarity matrix, and compare this with previous approaches to establishing node topological relationships in weighted financial networks. In experiments, we compare and evaluate the indices of community mining results on low-frequency financial time series data against traditional community detection algorithms. In addition, by applying our model to high-frequency financial data, we analyze in detail the structural information of the stock nodes in the Shanghai stock market. The results show that stocks in the same market classification are not necessarily highly correlated, and that strongly associated stocks may come from different market classifications.
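
The pipeline described in the second aspect (time series, then similarity matrix, then adjacency matrix, then communities) can be illustrated with a minimal sketch. This is not the thesis's method: absolute Pearson correlation stands in for the compressed-sensing-based similarity, a fixed cutoff stands in for the adjacency construction, and connected components stand in for community detection. The function names, the 0.5 threshold, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def similarity_matrix(returns):
    """Absolute Pearson correlation between rows (one row per stock).
    Assumption: a stand-in for the compressed-sensing similarity."""
    sim = np.abs(np.corrcoef(returns))
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    return sim

def adjacency_from_similarity(sim, threshold=0.5):
    """Keep an undirected edge wherever similarity exceeds the cutoff.
    The threshold value is a hypothetical choice."""
    return (sim > threshold).astype(int)

def connected_components(adj):
    """Label connected components by depth-first search.
    A crude stand-in for a real community detection algorithm."""
    n = adj.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(adj[node]):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(int(nb))
        current += 1
    return labels

# Toy data: two groups of three series, each group driven by its own factor.
rng = np.random.default_rng(0)
factor_a = rng.normal(size=200)
factor_b = rng.normal(size=200)
returns = np.vstack(
    [factor_a + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [factor_b + 0.1 * rng.normal(size=200) for _ in range(3)]
)

sim = similarity_matrix(returns)
adj = adjacency_from_similarity(sim, threshold=0.5)
labels = connected_components(adj)
print(labels)
```

On this toy input the series that share a factor should end up with the same label. With real market data, the grouping recovered this way need not match the official market classification, which is the kind of discrepancy the abstract reports.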
Keywords/Search Tags: Statistical inference, similarity matrix model, compressed sensing, community detection