
Research On Novel Biclustering Analysis Method For MiRNA-targeted Gene Data Based On Parallel Graph Autoencoder

Posted on: 2022-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2480306332457904
Subject: Software engineering
Abstract/Summary:
Traditional clustering methods group samples with similar attributes or features according to the similarity between samples. However, as the number of samples and features grows, the running time of a single clustering method becomes prohibitive. In addition, the internal components of large data sets are complex and often sparse, so a single clustering method cannot handle their noise well. Single clustering methods, which rely on a global search over low-dimensional data, are therefore poorly suited to clustering high-dimensional and large-scale data.

Biclustering methods were developed to overcome these shortcomings. Unlike traditional single clustering, a biclustering algorithm is based on the local relationship between samples (rows) and features (columns) and works on both dimensions simultaneously. In recent years, biclustering has developed rapidly and has been widely used in gene analysis, text clustering, recommendation systems and other fields. In bioinformatics, it is often used to analyze gene expression data. Large biological data sets, however, contain more relational data, such as miRNA-targeted gene data, than expression data such as gene expression data. At present, most biclustering algorithms are designed for expression data, and biclustering algorithms for binary relational data are rarely discussed, because expression data carry rich numerical information while relational data contain only 0/1 relational information.

To mine biclustering modules in relational data, this paper works with soybean miRNA-targeted gene data (including the relational matrix of miRNA-targeted genes and the graph data of target genes). It proposes GAEBic, a biclustering algorithm based on a graph autoencoder; constructs a parallel graph autoencoder model, PGAE, to capture the relationships between target genes in miRNA-targeted gene data; and proposes a novel irregular clustering strategy, BiGAE, to mine more biologically meaningful biclustering modules.

The input of the PGAE model is the graph data and the attribute data of the target genes. The model consists mainly of an encoder and a decoder. Using the graph data of target genes, this paper compares PGAE with DeepWalk, LINE and Node2Vec on the task of predicting connections between target genes. The results show that PGAE performs significantly better than DeepWalk, LINE and Node2Vec, and that it learns graph embeddings reliably.

The input of the BiGAE algorithm is the binary relation matrix of miRNA-targeted genes and the embedding matrix of target genes. BiGAE uses the embedding matrix as the measurement matrix for the similarity between target genes, and introduces the non-zero coverage rate as a second index so that the biclustering modules are irregular and more biologically meaningful.

Using the soybean miRNA-targeted gene relationship data, we compared GAEBic with the Spectral Biclustering, Bibit and Bimax algorithms. GO enrichment analysis of the results of the four algorithms shows that the GO enrichment rate of GAEBic's biclustering results is significantly higher than that of the other three algorithms, and that the performance of GAEBic's biclustering results is also better than that of the other three. GAEBic explores the application of biclustering analysis to binary data and offers a feasible research direction for later work on biclustering analysis.
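The two indices that BiGAE is described as combining, embedding-based similarity between target genes and the non-zero coverage rate of a candidate module, can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption: coverage is taken to be the fraction of 1 entries inside the selected submatrix, similarity is taken to be the mean pairwise cosine similarity of the embeddings, miRNAs are placed on rows and target genes on columns, and the function names and toy values are hypothetical.

```python
import numpy as np

def nonzero_coverage(relation, rows, cols):
    """Fraction of entries equal to 1 in the submatrix rows x cols (assumed definition)."""
    sub = relation[np.ix_(rows, cols)]
    return float(sub.sum()) / sub.size

def mean_cosine_similarity(embeddings, genes):
    """Average pairwise cosine similarity among the selected gene embeddings (assumed measure)."""
    vecs = embeddings[genes]
    unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sim = unit @ unit.T
    n = len(genes)
    # Exclude the diagonal (self-similarity of 1.0) from the average.
    return float((sim.sum() - n) / (n * (n - 1)))

# Toy binary relation matrix: 3 miRNAs (rows) x 4 target genes (columns).
relation = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
])

# Toy 2-D embeddings for the 4 target genes (hypothetical values,
# standing in for what a graph autoencoder would produce).
embeddings = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
])

# Candidate module: miRNAs 0 and 1 with target genes 0, 1 and 2.
coverage = nonzero_coverage(relation, rows=[0, 1], cols=[0, 1, 2])
similarity = mean_cosine_similarity(embeddings, genes=[0, 1])
```

In this toy case the candidate module contains one zero among six entries, so it is not a strict all-ones block; accepting modules above a coverage threshold rather than requiring full coverage is one way the "irregular" modules mentioned above could arise.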
Keywords/Search Tags:biclustering, graph autoencoder, miRNA-targeted gene, binary data