Semantic text mining and its application in biomedical domain

Posted on: 2007-07-21
Degree: Ph.D
Type: Thesis
University: Drexel University
Candidate: Yoo, Illhoi
Full Text: PDF
GTID: 2458390005984740
Subject: Computer Science
Abstract/Summary:
A huge amount of biomedical knowledge and many novel discoveries have been produced and collected in text databases and digital libraries such as MEDLINE, because text is the most natural form in which to store information. Text mining is employed to cope with this pressing information overload. However, traditional text mining approaches have several problems, such as their reliance on the vector representation of documents. In this thesis, we introduce a semantic text mining approach that can overcome these problems. The approach consists of the following text mining components: a graphical representation method for documents that relies on domain ontologies; document clustering that takes advantage of scale-free network theory to mine the corpus-level graphical representation; text summarization; and a semantic version of Swanson's ABC model. The primary contributions of this dissertation are fourfold. First, we introduce a graphical representation method for documents that takes advantage of domain ontologies. Second, the semantic document clustering approach is unique in that it provides users with document cluster models derived from an ontology-enriched scale-free representation of a set of documents; these models serve as summaries of each document cluster and also explain the document categorization. Third, in order to maximize the usefulness of document clustering, we introduce a text summarization approach that makes use of the document cluster models. Finally, we introduce a semantic way to generate reasonable hypotheses based on evidence from the biomedical literature, using the complementary structures in disjoint literatures.
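For readers unfamiliar with Swanson's ABC model mentioned in the abstract, the basic idea can be sketched in a few lines of Python: if a term A co-occurs with intermediate terms B in one literature, and a term C co-occurs with the same B terms in a separate literature, then an A-C link is a candidate hypothesis. This is only a minimal term co-occurrence sketch, not the semantic, ontology-based version developed in the thesis. The function name `abc_hypotheses` and the toy document sets are assumptions made for illustration, loosely modeled on Swanson's well-known fish oil and Raynaud's disease example.

```python
def abc_hypotheses(literature_a, literature_c, a_term, c_term):
    """Return the intermediate B terms that link a_term and c_term.

    Each literature is a list of documents, and each document is a set
    of terms. This is plain co-occurrence, an illustrative assumption.
    """
    def linked_terms(docs, anchor):
        # Collect every term that shares a document with the anchor term.
        linked = set()
        for terms in docs:
            if anchor in terms:
                linked |= set(terms) - {anchor}
        return linked

    # B terms are those connected to A in one literature and to C in the other.
    shared = linked_terms(literature_a, a_term) & linked_terms(literature_c, c_term)
    return sorted(shared)


# Toy data (hypothetical documents, for illustration only).
literature_a = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"fish oil", "triglycerides"},
]
literature_c = [
    {"raynaud disease", "blood viscosity"},
    {"raynaud disease", "platelet aggregation"},
    {"raynaud disease", "vasoconstriction"},
]

links = abc_hypotheses(literature_a, literature_c, "fish oil", "raynaud disease")
print(links)
```

Each B term returned is a piece of evidence that the two literatures are complementary even though no single document mentions A and C together.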
Keywords/Search Tags:Text, Biomedical, Semantic