Statistical Inference Of High-dimensional Complex Data Networks And Node Attributes

Posted on: 2018-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Liao
GTID: 2350330533461929
Subject: Probability theory and mathematical statistics
Abstract/Summary:
First, this thesis studies the correlation between two correlation networks (correlation coefficient matrices). It addresses two problems: testing for correlation between networks, and the fact that high-dimensional networks contain unobservable latent variable structures. To handle both, singular value decomposition is used to obtain a feature matrix for each network, and the Wilks' Lambda statistic is computed from these feature matrices to test the correlation between the networks. If the test indicates a significant linear relationship, a linear regression model is built on the low-rank factors. If the test indicates a nonlinear relationship, a nonlinear model is used and its parameters are estimated by maximum likelihood. For missing values, the model uses singular value decomposition to restore the feature matrix, fill in the missing correlations in the network, and predict potential correlations.

Second, motivated by the correlation property of the precision matrix under the normal distribution, samples from a general continuous distribution are transformed to a normal distribution through a Gaussian copula model in order to identify the distribution of the missing values. This prediction step provides a reasonable and effective basis for judging and explaining how missing values are filled. The thesis also introduces an inverse regression model and a latent variable Gaussian copula model for mixed data, which address the sparsity and mixed data types of high-dimensional complex data.

Finally, numerical simulations of the model under small samples are carried out in four experiments: symmetric network regression coefficients, missing value filling, unknown node relations, and missing value filling and prediction for asymmetric networks. The standard error is small, and the model results improve as the dimension increases. In the empirical analysis section, the four experiments are applied to drug structure similarity, target protein sequence similarity, and drug-target interaction data. The method performs better than the comparison method in both the similarity test and the model prediction, and the improvement is especially significant in the dimension reduction step.
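The abstract does not spell out how the singular value decomposition is used to fill missing correlations, so the following is only a minimal sketch of one common approach: iterating a truncated SVD and overwriting only the missing entries. The function name `svd_impute`, the parameters `rank`, `n_iter`, and `tol`, and the choice to start from the mean of the observed entries are all assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np


def svd_impute(matrix, rank, n_iter=100, tol=1e-6):
    """Fill NaN entries of `matrix` using repeated rank-`rank` SVD fits.

    Hypothetical helper: an illustrative iterative low-rank fill,
    not the procedure used in the thesis.
    """
    observed = ~np.isnan(matrix)
    # Start by putting the mean of the observed entries in the gaps.
    filled = np.where(observed, matrix, np.nanmean(matrix))
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Keep only the leading `rank` singular triplets.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        # Observed entries stay fixed; only the gaps are updated.
        updated = np.where(observed, matrix, low_rank)
        change = np.linalg.norm(updated - filled)
        filled = updated
        if change < tol:
            break
    return filled


# Usage: a small symmetric low-rank matrix with one missing pair of entries.
rng = np.random.default_rng(0)
factors = rng.normal(size=(8, 2))
full = factors @ factors.T
with_gaps = full.copy()
with_gaps[0, 3] = np.nan
with_gaps[3, 0] = np.nan
restored = svd_impute(with_gaps, rank=2)
```

In this sketch the observed entries are never altered, so the low-rank structure is used only to propose values for the gaps. How the rank is chosen is left open here.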
Keywords/Search Tags: SVD, Wilks' Lambda statistic, parametric inference, missing data, complex data