
Large-scale Study Of Protein-protein Interaction In Human Liver

Posted on: 2010-07-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L J Tang
GTID: 1100360275962301
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
The liver is the largest gland in the human body. Its major functions include metabolism, bile secretion, participation in the formation and destruction of red blood cells, synthesis of plasma proteins and clotting factors, and detoxification. The liver therefore plays a vital role throughout human life. Most proteins carry out their functions through protein-protein interactions, which play key roles in most cellular processes. A large-scale study of the protein-protein interaction network in human liver is a way not only to understand the molecular regulatory mechanisms of proteins and the principles governing their relationships, but also to find functional clues for unknown proteins and new functions of known proteins. It will also help to explain the molecular mechanisms of liver disease at the network level and to discover new biomarkers and drug targets.

In this dissertation, a high-throughput yeast two-hybrid (Y2H) array platform was designed and set up. The protein-protein interactions of key proteins in human liver were screened. The interaction dataset from Y2H screening was evaluated by a variety of methods. A first-draft map of the human liver protein-protein interaction network was built, and the characteristics of this network and their possible biological significance were further analyzed. The dissertation is divided into two chapters:

1. A first-draft map of the human liver protein-protein interaction network and analysis of its characteristics. An automated high-throughput yeast two-hybrid array platform was designed and set up. High-throughput techniques were developed for plasmid preparation, transformation of E. coli or yeast, and Y2H array screening. These techniques were integrated with the high-throughput vector construction technique developed by Hubei University. The screening throughput reaches one 384×384 array per day; that is, about 150,000 possible interactions are tested every day. To build an array of 10,000 yeast two-hybrid colonies in total on this basis, a matrix for systematic interaction mating was created.
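As a quick arithmetic check of the stated daily throughput, a minimal Python sketch follows. The names are illustrative and not from the dissertation; only the 384×384 array size and the one-array-per-day rate come from the text.

```python
# Back-of-the-envelope check of the stated screening throughput.
# Names are illustrative; only the 384 x 384 array size is from the text.
ARRAY_ROWS = 384  # positions along one axis of the array
ARRAY_COLS = 384  # positions along the other axis


def pairs_per_array(rows, cols):
    """Number of pairwise combinations covered by one rows x cols array."""
    return rows * cols


daily_pairs = pairs_per_array(ARRAY_ROWS, ARRAY_COLS)
print(daily_pairs)  # 147456, consistent with "about 150,000" per day
```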
About 3,500 bait vectors and 5,100 prey vectors were transferred into yeast strains of opposite mating types, MATα and MATa. Genes that did not pass the self-activation test were discarded. Among the remaining clones, every 12 preys were combined into one pool, and each pool was screened against each bait by mating. Diploid yeast clones that activated the LacZ reporter gene in the X-GAL assay were judged positive. Every positive from the first round of screening was then mated with each of the 12 preys of the corresponding pool in the second round. Clones that could grow on SCIV (-Trp-Leu-His-Ura), indicating activation of the HIS3 and URA3 reporter genes, and/or that activated the LacZ reporter gene in the X-GAL assay were judged positive. Only interactions that passed both independent rounds of screening were considered true positives. More than 15 million pairs (a matrix of 3,000 baits and 5,000 preys) were tested, and a total of 991 protein-protein interactions among 939 proteins were identified.

To evaluate the accuracy of the Y2H dataset, 384 randomly selected pairs were re-verified by re-mating in yeast; the positive rate was 87%. Then 40 randomly selected interactions were tested by co-immunoprecipitation (co-IP) assay. Ten interactions were successfully tested, and 8 of them were positive. Although the number of confirmed interactions is small, this result still suggests that most of the protein-protein interaction data can be verified by independent methods and are highly reliable. Furthermore, a variety of bioinformatics methods were used to evaluate our dataset, including comparison with known interaction pairs in HPRD and PubMed and analysis with the web tool PRINCESS. Of the 991 interactions, 64 (6.4%) were already known, and 410 pairs scored more than 2 points, which means that more than 40% of the interactions are highly reliable. This ratio is higher than those of the two previously reported large-scale studies of human protein-protein interactions.
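The two-round pooled screening strategy described above (pools of 12 preys first, then the individual preys of each positive pool) can be sketched in Python as follows. This is an idealized illustration, not the dissertation's pipeline: `interacts` is a hypothetical stand-in for a reporter assay with no false positives or negatives, a pool is assumed to test positive whenever any of its members interacts with the bait, and all bait and prey names are made up.

```python
import math

POOL_SIZE = 12  # preys per pool, as stated in the text


def make_pools(preys, size=POOL_SIZE):
    """Split the prey list into consecutive pools of at most `size` preys."""
    preys = list(preys)
    return [preys[i:i + size] for i in range(0, len(preys), size)]


def pooled_screen(baits, preys, interacts):
    """Idealized two-round screen.

    Round 1 mates each bait with each pool; round 2 mates the bait with
    every prey of a positive pool. `interacts(bait, prey)` is a
    hypothetical, error-free stand-in for the reporter assays.
    Returns the confirmed pairs and the number of matings performed.
    """
    pools = make_pools(preys)
    hits = []
    matings = 0
    for bait in baits:
        for pool in pools:
            matings += 1  # round 1: one mating per bait-pool combination
            if not any(interacts(bait, prey) for prey in pool):
                continue
            for prey in pool:
                matings += 1  # round 2: one mating per prey of the pool
                if interacts(bait, prey):
                    hits.append((bait, prey))
    return hits, matings


# Toy example with made-up names: 3 baits, 24 preys (2 pools).
known = {("B1", "P3"), ("B2", "P20")}
baits = ["B1", "B2", "B3"]
preys = [f"P{i}" for i in range(1, 25)]
hits, matings = pooled_screen(baits, preys, lambda b, p: (b, p) in known)

# First-round matings for the 3,000 x 5,000 matrix under this pooling scheme.
first_round_matings = 3000 * math.ceil(5000 / POOL_SIZE)
```

In the toy example the pooled design needs 30 matings instead of the 72 required to test every pair individually. Under the same idealization, the first round for a 3,000 × 5,000 matrix would take 3,000 × 417 = 1,251,000 bait-pool matings, compared with 15 million pairwise matings.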
These results show that our dataset is highly reliable and that the platform has high sensitivity.

We presented the interactions as network graphs using the Osprey network visualization system. The networks show that our dataset complements the existing network. At the same time, the interactions of the newly discovered proteins may provide hints of many new regulatory mechanisms and functions. In addition, we analyzed the topology of the interaction network and found that the properties of the network obtained from array screening are similar to those of other eukaryotic networks: it is small-world and scale-free. We attempted to mine the dataset at three levels: surface (the regulatory network of an important signaling pathway), line (the potential connection between two biological processes), and point (revealing a novel function or control mechanism of an important protein). An example from each level is given. These results may provide important clues for the functional analysis of the interactions.

2. Mining of the protein-protein interactions of metabolism-related proteins in human liver. Metabolism is an important function of the liver, but only a limited number of proteins are contained in the matrix. To obtain interactions with preys that were not in the matrix, bait vectors for 27 metabolism-related proteins were successfully constructed and used to screen a human liver cDNA library. In each screening, more than 1×10^6 independent clones of the cDNA library transformed with AD-Y were covered. A total of 500 candidate positive clones were obtained. Sequencing yielded 221 prey sequences, of which 109 were in frame. After removal of redundant interactions, 73 distinct protein interaction pairs were finally obtained. To estimate the technical false positive rate, these interactions were verified by retesting them in yeast cells. The overall recovery rate for all interactions was 52.5%.
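The filtering steps of the cDNA library screen can be summarized as a simple funnel. The Python sketch below only restates the counts given in the text and computes the fraction retained at each step; the stage labels and the function name are illustrative, and treating clones, sequences, and interaction pairs as directly comparable counts is a simplifying assumption.

```python
# Counts reported in the text for the cDNA library screen.
funnel = [
    ("candidate positive clones", 500),
    ("prey sequences after sequencing", 221),
    ("in-frame sequences", 109),
    ("non-redundant interaction pairs", 73),
]


def stepwise_retention(stages):
    """Return (stage name, fraction retained from the previous stage)."""
    return [
        (name, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


retention = stepwise_retention(funnel)
overall = funnel[-1][1] / funnel[0][1]

for name, frac in retention:
    print(f"{name}: {frac:.1%}")
print(f"overall: {overall:.1%}")
```

Under this reading, roughly 44% of candidate clones yielded a prey sequence, about half of those sequences were in frame, and about 15% of the initial candidates ended up as distinct interaction pairs.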
These results indicate that there are some technical false positives in our dataset. However, in a study by other researchers, the positive rate for known interactions was only 53.3%, indicating that the retest assay itself has a high false negative rate; therefore, in the follow-up analysis, we retained the interactions that tested negative. To assess the credibility of the interactions, a variety of bioinformatics methods were adopted, covering known interactions in HPRD and PubMed, gene co-expression, GO annotation, interaction domains, homology in model organisms, and network topology. Four interaction pairs occur in HPRD; 35 pairs co-occur in the literature; 15 pairs participate in the same biological process or have the same molecular function; 8 pairs contain interaction domains; and one interaction has a homologous counterpart in model organisms. These results show that our dataset is highly reliable and is an effective supplement to the human liver protein-protein interaction network obtained by array screening.

In this dissertation, an automated high-throughput yeast two-hybrid array platform was designed and set up. It may provide technical support for large-scale studies of protein-protein interactions in different species or of interactions between pathogen proteins and human proteins. At the same time, the protein-protein interaction network in human liver obtained in this study not only reveals a number of functional clues for unknown proteins and new functions of known proteins, but also helps to understand the mechanisms of liver disease at the network level and to discover new biomarkers and drug targets.
Keywords/Search Tags:large-scale, protein-protein interaction, Y2H, liver