Research On Ontology Based Approaches For Disease Data Integration And Mining

Posted on: 2015-07-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Cheng
Full Text: PDF
GTID: 1108330479478594
Subject: Computer application technology
Abstract/Summary:
In recent years, a large body of research has focused on disease-related topics, such as the integration of disease-related databases, disease similarity, and the relationships between terms of the Disease Ontology (DO) and the Gene Ontology (GO). Inconsistent disease terms across databases, the diversity of associations between diseases, and the difficulty of measuring literature-based relationships between terms make it hard, respectively, to integrate disease-related databases, to calculate similarity between terms, and to mine relationships between terms across ontologies. This dissertation addresses these difficulties. The main contributions are:

(1) An approach for integrating disease-related databases is proposed. There are many disease-related databases, and each focuses on associations between diseases and one or two types of features. Because these databases are not linked to one another, it is difficult to obtain a global view of a disease. Two types of mapping, mapping by synonym and mapping by inference, are used to tag synonymy relationships and set-inclusion relationships between terms, respectively. Based on these mappings, the terms of the disease-related databases are integrated into DO. The databases are classified by disease feature, and the features are converted into commonly used identifiers. Duplicate records of relationships between features and diseases are then discarded by comparing these identifiers. Based on the integrated database, associations between diseases and associations between features are mined.

(2) An approach for calculating disease similarity based on a gene association network is proposed. Using associations between genes is an active research topic in measuring similarity between diseases. Many types of associations between genes exist; however, existing approaches use only one or two of them.
In this method, the similarity between two diseases is converted into an association score between their gene sets. The algorithm is designed on a comprehensively weighted human gene association network. First, the edge weights of the network are normalized. Second, the association score between gene sets is measured on the normalized network. Pairs of similar diseases extracted from the literature were used as a benchmark set to evaluate the disease similarity method. The experimental results show that the method outperforms other methods.

(3) An approach for calculating disease similarity by integrating semantic associations with the gene association network is proposed. A comprehensively weighted human gene association network is used to measure the association score between disease-related gene sets. The numbers of genes associated with a pair of diseases and with their common ancestors are used to weight the semantic association score of the disease pair. The product of these two association scores gives the integrated disease similarity. The method was verified to be consistent with the assumption that similar diseases can often be treated by similar drugs. A system based on this assumption was also implemented for mining potential therapeutic drugs for diseases.

(4) An approach for relating terms across ontologies based on the literature is proposed. Occurrences of terms in the literature are commonly used to measure associations between terms across ontologies. However, semantic associations between terms within an ontology are often ignored, which hinders the discovery of associations between terms. Here, the semantic associations of the ontologies are used to extend the relationships between terms and literature, and these extended relationships are used to weight the relatedness score between terms. The method was used to mine associations between terms across DO and GO. The experimental results show that the method performs very well.
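
As an illustration of parts (2) and (3), the following is a minimal Python sketch, not the dissertation's implementation. The abstract states only that edge weights are normalized, that a score is computed between gene sets, and that the integrated similarity is the product of the semantic and gene-based scores. Everything else here is an assumption made for illustration: normalization by dividing by the maximum weight, the mean pairwise aggregation over gene sets, the function names, the toy gene identifiers, and the semantic score of 0.5.

```python
from itertools import product


def normalize_weights(edges):
    """Scale edge weights into (0, 1] by dividing by the maximum weight.

    Assumption: the abstract does not say which normalization is used.
    """
    max_weight = max(edges.values())
    return {pair: weight / max_weight for pair, weight in edges.items()}


def gene_pair_score(network, gene_a, gene_b):
    """Return 1.0 for identical genes, else the edge weight (0.0 if no edge)."""
    if gene_a == gene_b:
        return 1.0
    return network.get(frozenset((gene_a, gene_b)), 0.0)


def gene_set_score(network, genes_a, genes_b):
    """Mean pairwise association between two gene sets.

    Assumption: a simple average stands in for the unspecified set-level score.
    """
    if not genes_a or not genes_b:
        return 0.0
    total = sum(
        gene_pair_score(network, a, b) for a, b in product(genes_a, genes_b)
    )
    return total / (len(genes_a) * len(genes_b))


def integrated_similarity(semantic_score, gene_score):
    """Product of the semantic and gene-based scores, as stated in the abstract."""
    return semantic_score * gene_score


# Hypothetical toy network: undirected edges keyed by frozenset, made-up weights.
raw_edges = {
    frozenset(("G1", "G2")): 2.0,
    frozenset(("G2", "G3")): 4.0,
    frozenset(("G1", "G4")): 8.0,
}
network = normalize_weights(raw_edges)

disease_a_genes = {"G1", "G2"}  # hypothetical gene set of disease A
disease_b_genes = {"G2", "G3"}  # hypothetical gene set of disease B

gene_score = gene_set_score(network, disease_a_genes, disease_b_genes)
similarity = integrated_similarity(0.5, gene_score)  # 0.5 is a made-up semantic score
print(gene_score, similarity)
```

Keying edges by `frozenset` keeps the toy network undirected, so a lookup does not depend on the order of the two genes.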
Keywords/Search Tags: integration of disease-related databases, disease ontology, disease term, disease feature, disease similarity, gene association network, semantic association, association between terms across ontologies