Font Size: a A A

Efficient querying of biological networks

Posted on:2017-08-28Degree:Ph.DType:Dissertation
University:University of FloridaCandidate:Hasan, Md MahmudulFull Text:PDF
GTID:1468390011969658Subject:Computer Engineering
Abstract/Summary:
Biological processes drive different functions in cells using complex interaction networks among molecules. We begin querying such biological networks by finding frequent subnetworks in a given collection of network topologies. We present a method named SiS that solves this problem efficiently and finds frequent subnetworks in metabolic networks and human transcription regulatory networks data set. We then find the alignment of a small query network with an arbitrarily large target network. The complexity of this problem grows exponentially with the number of nodes in the query network if confidence in the optimality of result is desired. We develop a color coding based method named ColT that reduces the cost of this bottleneck. It exploits the topology of the query and uses the color distribution in the target network to filter unpromising alignments without compromising the confidence in the optimality. ColT identifies topologically and functionally similar regions in the targets much faster. Scaling this problem to large query and target networks remains to be a challenge. We then develop a novel index structure that reduces the cost of the alignment problem. Our index structure maintains a small set of reference networks where each reference is a small, carefully chosen subnetwork from the target network. Along with each reference, we also store all of its non-overlapping and statistically significant alignments with the target network. Our index yields statistically significant alignments with a significant speed-up in running time over ColT. We observe that our method finds biologically and statistically significant alignments across multiple species. The topology of biological network evolves with time which renders these existing index structures obsolete. Creating a new index from scratch every time the network evolves is impractical particularly for large and highly evolving networks. We develop a new and scalable reference based index that is updated dynamically to efficiently solve the alignment problem in large evolving target networks. We observe that the cost of dynamically updating the index after a small topological change in the target network is negligible as compared to building the index from scratch. Our experiments on human development cell types show that dynamic update of index is fast and usefulness of dynamic index becomes more evident with increasing number of topological changes.
Keywords/Search Tags:Network, Query, Index, Biological, Statistically significant alignments
Related items