Analysis of network type data using statistical methods

Posted on: 2012-03-09 | Degree: Ph.D. | Type: Thesis
University: Boston University | Candidate: Yang, Shu | Full Text: PDF
GTID: 2458390008991939 | Subject: Biology
Abstract/Summary:
In many scientific fields, networks are widely used to represent or analyze a system. Network type data are high-dimensional measurements that are either obtained from a network system or can be represented in terms of networks. In recent years, there has been rapid growth in high-throughput data that can be viewed as network type data, and statistical models have become the major tool for analyzing such data. In this thesis, I explore network type data from three aspects using different statistical techniques.

First, I study the accurate detection of sources of perturbations in a complex network. In networks, the interactions between nodes make it difficult to localize the sources of external perturbations. To detect the perturbation source, a 'network filtering' system can be used to filter out this interaction effect. However, the theoretical rationale behind this system has not been well understood. To this end, I present a theoretical characterization of why and when 'network filtering' can detect external perturbations accurately. I then study the implications of these conditions for various network topologies through simulation studies.

Second, I explore the use of network topology to design multi-scale clusters on graphs. Specifically, I adopt the framework of 'diffusion wavelets' and adapt it for graphs with heavy-tailed degree distributions. Based on scaling functions from the diffusion wavelets, I define a collection of vertex subsets across multiple scales on graphs, and apply it to a yeast protein interaction network to obtain multi-scale gene sets. The effectiveness of these gene sets is supported by comparisons with standard gene sets and by an application to differential expression analysis.

Third, I explore the optimal design of perturbations in order to obtain observations that help us study networks more effectively. Under the same model used in my first project, I propose an optimal design framework that can model the whole network. I begin with the perturbation design for individual units, and then study the general case with several approximation methods. I validate the design methods through simulation studies.
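
To make the 'network filtering' idea from the first project concrete, here is a minimal sketch under an assumed model. The abstract does not state the thesis's model, so the linear steady-state form x = A x + u, the toy weight matrix `A`, and the helper names `simulate`, `network_filter`, and `argmax_abs` are all illustrative assumptions rather than the author's method. The sketch shows how propagation through the network can make a downstream node look like the source, and how subtracting the part explained by network neighbors points back to the perturbed node.

```python
# Illustrative only: an ASSUMED linear model x = A x + u, not the thesis's model.
# A[i][j] is the (hypothetical) influence of node j on node i; u is the external
# perturbation applied to each node.

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def simulate(A, u):
    """Steady state of x = A x + u by repeated substitution.

    Exact after len(u) steps when the network is acyclic (A is nilpotent),
    which is the case for the toy example below.
    """
    x = list(u)
    for _ in range(len(u)):
        x = [ax + ui for ax, ui in zip(mat_vec(A, x), u)]
    return x

def network_filter(A, x):
    """Remove the part of each observation explained by its network neighbors."""
    return [xi - ax for xi, ax in zip(x, mat_vec(A, x))]

def argmax_abs(v):
    """Index of the entry with the largest absolute value."""
    return max(range(len(v)), key=lambda i: abs(v[i]))

# Toy chain 0 -> 1 -> 2 with a strong edge from node 0 to node 1.
A = [[0.0, 0.0, 0.0],
     [2.0, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
u = [1.0, 0.0, 0.0]           # only node 0 is perturbed

x = simulate(A, u)            # observed response across the network
residual = network_filter(A, x)

print("largest raw response at node", argmax_abs(x))
print("largest filtered residual at node", argmax_abs(residual))
```

In this noise-free toy case the raw response peaks at a downstream node, while the filtered residual peaks at the perturbed node. With noisy observations or an estimated `A`, whether filtering still localizes the source depends on conditions of the kind the thesis characterizes.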
Keywords/Search Tags: Network type data, Statistical, System