Research Of Web Biological Information Retrieval And Extraction Technologies Based On Ontology

Posted on:2006-02-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y Cheng

Full Text:PDF

GTID:2178360212482701

Subject:Computer applications

Abstract/Summary:

With the rapid development of both the Internet and biological information science, finding biological information data sources promptly has become very important. Traditional keyword-based search engines ignore the semantic information that keywords carry and therefore achieve low recall and precision, so they are increasingly ill-suited to this requirement. Moreover, the Web has evolved into a vast, distributed, and shared information resource, but most Web data is currently wrapped in HTML, which prevents applications from reusing it directly. Web information extraction technology emerged to address this problem.

In this thesis, building on a study of Semantic Web and ontology technologies and a comprehensive survey of information retrieval and semi-structured Web information extraction technologies, the author focuses on discovering biological information data sources and extracting biological information data. To discover useful biological information data sources, the author presents a biological information retrieval system based on ontology and feature phrases. The author also presents an ontology-driven extraction method that locates key information through document structure and pattern matching in order to extract the required data.

The author has implemented a user-guided, interactive information extraction prototype system. First, it fetches the specified Web page and converts it into a well-formed XML document using JTidy. An XML parser then represents the XML document as a DOM tree. Next, the user specifies an XPath expression to select the required data slot, and the data contained in that slot is extracted by the OntPMatch algorithm. Finally, the extracted data is stored in a structured form.

The thesis thus implements a prototype system for discovering biological information data sources and extracting biological information data. It helps users obtain more useful and satisfactory information from the Web than before and provides a valuable tool for making full use of the vast amount of data on the Web.
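The extraction steps described above (well-formed XML, DOM tree, user-supplied XPath, ontology-driven pattern matching, structured storage) can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not describe how OntPMatch works, so the matching step below is a simple hypothetical stand-in that pairs ontology labels with regular expressions. The sample page, the ONTOLOGY table, and the extract function are all assumptions made for illustration, and the page is assumed to have already been tidied into well-formed XML (the job JTidy does in the thesis).

```python
import re
import xml.etree.ElementTree as ET

# Assumption: this is the page AFTER the tidy step, i.e. already well-formed XML.
# The content is made-up sample data, not taken from the thesis.
TIDIED_PAGE = """<html><body>
<table id="entry">
<tr><td>Organism</td><td>Homo sapiens</td></tr>
<tr><td>Accession</td><td>P69905</td></tr>
<tr><td>Length</td><td>142 aa</td></tr>
</table>
</body></html>"""

# Hypothetical mini-ontology: concept -> (accepted labels, value pattern).
# A stand-in for the domain ontology that drives extraction in the thesis.
ONTOLOGY = {
    "organism": ({"organism", "species"}, re.compile(r"[A-Z][a-z]+ [a-z]+")),
    "accession": ({"accession"}, re.compile(r"[A-Z][0-9][A-Z0-9]{3}[0-9]")),
    "sequence_length": ({"length"}, re.compile(r"(\d+)\s*aa")),
}


def extract(xml_text, slot_xpath):
    """Parse the document, select the data slot, and match values to concepts."""
    root = ET.fromstring(xml_text)           # build the element tree
    record = {}
    for row in root.findall(slot_xpath):     # user-specified XPath selects the slot
        cells = [(cell.text or "").strip() for cell in row.findall("td")]
        if len(cells) != 2:
            continue                         # skip rows that are not label/value pairs
        label, value = cells
        for concept, (labels, pattern) in ONTOLOGY.items():
            if label.lower() not in labels:
                continue
            match = pattern.search(value)
            if match:
                # Use the first capture group if the pattern has one, else the whole match.
                record[concept] = match.group(match.lastindex or 0)
    return record


# The XPath below uses only the subset supported by xml.etree.ElementTree.
record = extract(TIDIED_PAGE, ".//table[@id='entry']/tr")
print(record)
```

The returned dictionary plays the role of the structured store mentioned in the abstract; in practice it could be written to a database or serialized as XML. Only the standard library is used, so the XPath expression is limited to the subset that ElementTree understands.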

Keywords/Search Tags:

Ontology, Information Retrieval, Information Extraction, XML, DOM, Feature Phrase

Related items

1	Adaptive Web Information Extraction Method Research Based On Ontology
2	The Research And Realization Of Text Information Extraction Based On Ontology Applied In Intelligent Information Retrieval
3	The Method Of Fine-Grained Topic Information Extraction And Text Clustering Based On Chinese Phrase
4	Spatiotemporal-Phrase Based Video Retrieval
5	Domain Ontology-based Web Information Extraction Technology
6	Ontology-Based Web Information Retrieval
7	Algorithm Research For Text Information Retrieval Based On Web
8	The Research Of Medical Information Retrieval And Extraction Technologies Based On Ontology
9	Unstructured Information Search Based On Ontology Semantics And Object Feature
10	Research And Application Of 3D CAD Model Retrieval Technology Based On Feature Information