Integrated Analysis Of Genetic And Proteomic Data

Posted on: 2007-07-28
Degree: Doctor
Type: Dissertation
Country: United States
Candidate: David Michael Reif
Full Text: PDF
GTID: 1100581358672381
Subject: Human Genetics
Abstract/Summary:
In this study, we advance the working hypothesis that the joint analysis of genetic and proteomic data will provide more information for modeling disease susceptibility than either type of data alone. In the context of the simulations performed, we conclude that the availability of multiple types of data is beneficial when the underlying etiological model is complex and one or more of the functional variables are missing. These results provide a baseline for those planning to collect and/or analyze genetic, genomic, and proteomic data from the same samples. This study represents a first step towards evaluating the merits of combining genetic, genomic, and proteomic data from the same samples for the detection and characterization of biomarkers of human disease susceptibility. From these initial simulation studies, we make the following recommendations. First, when the underlying etiology of the disease is likely to be complex, measuring multiple types of data is advantageous, especially if the technologies are also likely to be limited in their ability to measure all biomarkers. Thus, we recommend that SNP data be measured in addition to gene expression and/or protein data. Second, we recommend that the multiple types of data be analyzed jointly. In the present study, a SNP-protein interaction was found when the etiological model consisted of two interacting proteins and one of the two proteins was missing from the datasets for technical reasons. Analyzing each type of data separately may also be beneficial. For example, when the functional SNPs and the functional proteins are all present in their respective datasets, separate analyses may provide a type of cross-validation. That is, confidence in the inferences made about the functional biomarkers could be increased if the SNPs and proteins discovered through statistical modeling are related to the same set of genes.
Finally, we recommend that additional simulations be carried out under a wider array of etiological models and dataset variations to fully evaluate the usefulness of the joint analysis of multiple types of data. These types of studies should prove invaluable to those planning to measure genomic and proteomic data from the same samples. The next five years will see the joint analysis of multiple data types become the standard, rather than the exception, in the study of complex human health and disease. Given the rapid expansion of technologies able to generate huge bodies of data, as well as their increasing acceptance in the biomedical research community, we anticipate real datasets appropriate for joint analysis will become increasingly common in the near future. The burgeoning field of research into high-throughput technologies will lead to continued improvements in cost-efficiency and reliability and make their use even more widespread. With these data in hand, joint analysis of multiple biological levels becomes a viable option. The notion that integration of multiple data types is the only way to truly represent a complex system flows naturally from the complexity revealed as biologists gain a deeper understanding of common disease etiologies.
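The SNP-protein interaction scenario described in the abstract can be illustrated with a toy simulation. The sketch below is not the simulation design or the statistical method used in the dissertation. It assumes, purely for illustration, a binary SNP that drives the level of one protein (prot_b), a second independent protein (prot_a), and a disease status set by an XOR-style interaction between the two proteins. All names, parameter values, and the simple majority-vote scoring function are hypothetical.

```python
import random
from collections import Counter, defaultdict


def simulate(n=2000, seed=0, snp_effect=0.9, noise=0.05):
    """Simulate samples under a hypothetical two-protein interaction model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        snp = rng.random() < 0.5          # carrier status of a functional SNP
        prot_a = rng.random() < 0.5       # protein A is high or low at random
        # Protein B usually tracks the SNP (assumed effect strength).
        prot_b = snp if rng.random() < snp_effect else not snp
        # Disease arises when exactly one of the two proteins is high.
        disease = prot_a != prot_b
        if rng.random() < noise:          # random misclassification of status
            disease = not disease
        rows.append({
            "snp": int(snp),
            "prot_a": int(prot_a),
            "prot_b": int(prot_b),
            "disease": int(disease),
        })
    return rows


def cell_accuracy(rows, features):
    """In-sample accuracy of predicting the majority class within each
    combination of feature values (an illustrative scoring rule only)."""
    cells = defaultdict(Counter)
    for r in rows:
        key = tuple(r[f] for f in features)
        cells[key][r["disease"]] += 1
    correct = sum(max(c.values()) for c in cells.values())
    return correct / len(rows)


rows = simulate()

# Protein B is treated as missing for technical reasons, so only the SNP
# and protein A are available to the analyst.
acc_snp = cell_accuracy(rows, ["snp"])                 # genetic data alone
acc_prot = cell_accuracy(rows, ["prot_a"])             # proteomic data alone
acc_joint = cell_accuracy(rows, ["snp", "prot_a"])     # joint analysis
acc_full = cell_accuracy(rows, ["prot_a", "prot_b"])   # reference: nothing missing

print(f"SNP only:        {acc_snp:.3f}")
print(f"Protein A only:  {acc_prot:.3f}")
print(f"SNP + protein A: {acc_joint:.3f}")
print(f"Both proteins:   {acc_full:.3f}")
```

Under these assumed parameters, each data type analyzed alone predicts disease status at close to chance, while the joint SNP-protein analysis recovers most of the signal, because the SNP stands in for the missing protein. The effect depends entirely on the assumed interaction model and on how strongly the SNP determines the missing protein.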
Keywords/Search Tags:SNP-protein Interaction, Joint Analysis Simulation