Information Extraction And Analysis Based On Plant Ontology

Posted on: 2011-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Shi
Full Text: PDF
GTID: 2178360305474409
Subject: Computer application technology
Abstract/Summary:
Plant resources are an important national treasure that can be used sustainably. The information in floras and plant guides, which describe the botanical features of plants, guides the effective use of these resources. However, this information is usually loosely organized in the literature and is hard to use efficiently. Information Extraction (IE), which aims to acquire structured information from natural-language text, is a good way to address this obstacle. As a shared understanding of domain knowledge, an ontology offers an efficient means of overcoming the knowledge-engineering bottleneck, which is the main challenge that IE faces. This work therefore focuses on extracting plant information from floras and plant guides on the basis of a constructed plant ontology.

We first reviewed the general background of IE and ontologies, as well as research progress in plant knowledge engineering and ontology-based IE.

After a comprehensive comparison and analysis, we chose a "top-down" technical route, combined with the "Seven-Step Method" and the "ENTERPRISE Method", to construct the ontology. The concepts in the ontology were divided into event concepts and extended concepts. A comprehensive ontology for later use in IE was built through a series of steps: determining the scope of the domain ontology, acquiring domain knowledge, establishing the ontology framework (extracting event concepts, refining extended concepts, defining the relationships between concepts, adding specific instances, etc.), formalizing the ontology, evaluating and revising it, and finally establishing the ontology.

The Plant Information Extraction System was designed and implemented using a method based on the ontology and on classification-driven information extraction. First, we parsed the plant ontology and stored it in a database. Next, during text preprocessing, we introduced the concepts, instances and keywords of the plant ontology, and normalized, word-segmented and tagged the text to be extracted. We then categorized the sentences in the text according to classification rules, i.e. determined the event category of each sentence (the categories are derived from the event concepts in the plant ontology). Finally, we selected an extraction template according to the determined event category and extracted the entities in the template according to the tagging results. All classification and extraction rules in the system are expressed as regular expressions. Test results showed that the plant information extraction system performs well.
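The classify-then-extract flow described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the event concepts (`LeafDescription`, `Flowering`), the slot names, the regular expressions and the English sample text are all hypothetical, the ontology is held in an in-memory list rather than a database, and the word-segmentation and tagging steps are omitted for brevity.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EventConcept:
    """A hypothetical event concept taken from the ontology."""
    name: str
    classifier: re.Pattern            # rule deciding whether a sentence is this event
    template: Dict[str, re.Pattern]   # slot name -> rule with one capture group


MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

# Assumed stand-in for the event concepts stored in the database.
EVENTS: List[EventConcept] = [
    EventConcept(
        name="LeafDescription",
        classifier=re.compile(r"\b(?:leaf|leaves)\b", re.I),
        template={
            "shape": re.compile(r"\b(ovate|lanceolate|elliptic|linear)\b", re.I),
            "length_cm": re.compile(
                r"(\d+(?:\.\d+)?(?:\s*-\s*\d+(?:\.\d+)?)?)\s*cm\s+long", re.I),
        },
    ),
    EventConcept(
        name="Flowering",
        classifier=re.compile(r"\bflower(?:s|ing)?\b", re.I),
        template={
            "period": re.compile(rf"\b({MONTH}(?:\s+to\s+{MONTH})?)\b"),
        },
    ),
]


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.' or ';' when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.;])\s+", text) if s.strip()]


def classify(sentence: str) -> Optional[EventConcept]:
    # The first event whose classification rule matches wins.
    for event in EVENTS:
        if event.classifier.search(sentence):
            return event
    return None


def extract(text: str) -> List[dict]:
    records = []
    for sentence in split_sentences(text):
        event = classify(sentence)
        if event is None:
            continue  # sentence matches no event concept
        slots = {}
        for slot, pattern in event.template.items():
            match = pattern.search(sentence)
            if match:
                slots[slot] = match.group(1)
        records.append({"event": event.name, "slots": slots})
    return records


if __name__ == "__main__":
    sample = "Leaves ovate, 3-5 cm long. Flowering from April to June. Bark grey."
    for record in extract(sample):
        print(record)
```

Keeping the classification rule and the extraction template together on each event concept means that adding a new event category only requires a new entry in the list, which mirrors the idea of deriving sentence categories from the ontology's event concepts.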
Keywords/Search Tags: information extraction, ontology, plant, regular expression