
Research On Ontology-based Product Information Extraction System

Posted on: 2010-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: X H Zhang
Full Text: PDF
GTID: 2178360278975678
Subject: Computer application technology
Abstract/Summary:
With the rapid growth of the Internet and the information superhighway, vast amounts of data have come into being. How to find information of interest accurately and quickly in this ocean of information has therefore become an important problem. Information Extraction (IE) is a new information management technology: based on predefined templates, it extracts specific information from semi-structured and unstructured data. An IE system helps people find the information they need; because the information has been analyzed and organized effectively, people can easily locate what interests them.
This thesis presents a new approach to extracting product information from ordinary documents based on an application Ontology. We analyze the system architecture, the taxonomy of Information Extraction, and its key technologies and evaluation measures. We then introduce the main framework of the system and describe the design and implementation of its main modules, including data structures, the database and flow charts. Finally, the IE system is tested in a series of experiments and the extraction results are analyzed.
After analyzing many Ontology languages and the knowledge of the computer domain, we construct a computer Ontology. Drawing on the syntax parsing tool ApplePie, we design a new syntax analyzer for extracting information from semi-structured and unstructured data; it can simplify complex documents and sort sentences that carry multiple meanings. We also present a new approach to information extraction that integrates IE with Ontology. First, the document is preprocessed by syntax parsing. Second, the concepts, relations and keywords of the domain Ontology are used to generate Information Extraction rules. Then, the parsing results and the extraction rules are applied to the document to extract information. Finally, the result is output as a list of records.
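The four steps above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy `ONTOLOGY` dictionary, the attribute names, the keyword lists and the value patterns are all hypothetical, a plain sentence splitter stands in for the ApplePie-inspired syntax analyzer, and rules are simplified to "keyword followed by a value" regular expressions.

```python
import re

# Hypothetical toy ontology (an assumption, not taken from the thesis):
# one concept whose attributes carry trigger keywords and a value pattern.
ONTOLOGY = {
    "Computer": {
        "cpu": {"keywords": ["CPU", "processor"], "value": r"[\d.]+\s?GHz"},
        "memory": {"keywords": ["memory", "RAM"], "value": r"\d+\s?GB"},
        "price": {"keywords": ["price", "costs"], "value": r"\$\d+(?:\.\d+)?"},
    }
}


def split_sentences(text):
    """Step 1 stand-in: split the document into simple sentences.

    A real system would run a syntax parser here; this only splits on
    sentence punctuation followed by whitespace, or on line breaks.
    """
    return [s.strip() for s in re.split(r"[.;!?]\s+|\n+", text) if s.strip()]


def generate_rules(ontology):
    """Step 2: turn ontology keywords and value patterns into extraction rules."""
    rules = []
    for concept, attributes in ontology.items():
        for attribute, spec in attributes.items():
            keywords = "|".join(re.escape(k) for k in spec["keywords"])
            # Rule shape: a trigger keyword, then the nearest matching value.
            pattern = re.compile(
                rf"\b(?:{keywords})\b.*?({spec['value']})", re.IGNORECASE
            )
            rules.append((concept, attribute, pattern))
    return rules


def extract(text, ontology):
    """Steps 3 and 4: apply the rules to each sentence, return a list of records."""
    rules = generate_rules(ontology)
    records = []
    for sentence in split_sentences(text):
        for concept, attribute, pattern in rules:
            match = pattern.search(sentence)
            if match:
                records.append(
                    {"concept": concept, "attribute": attribute, "value": match.group(1)}
                )
    return records


if __name__ == "__main__":
    sample = "The processor runs at 2.4 GHz; memory is 8 GB. The price is $899.00."
    for record in extract(sample, ONTOLOGY):
        print(record)
```

Keeping the rules as data generated from the ontology, rather than hand-written per document, mirrors the idea in the abstract that the domain Ontology drives rule generation; extending the ontology then extends the extractor without changing the extraction code.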
The thesis designs and implements an Ontology-based product information extraction system using this approach. The experimental results indicate that the Ontology-based information extraction system improves the F-measure, which combines Recall and Precision.
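For reference, the F-measure is conventionally defined as the weighted harmonic mean of Precision and Recall. The abstract does not state which weighting the thesis uses, so both the general form and the balanced case (β = 1) are given below; the symbols P, R, β and the count names are notation assumed here.

```latex
P = \frac{N_{\text{correct}}}{N_{\text{extracted}}}, \qquad
R = \frac{N_{\text{correct}}}{N_{\text{relevant}}}, \qquad
F_\beta = \frac{(1+\beta^2)\,P\,R}{\beta^2 P + R}, \qquad
F_1 = \frac{2PR}{P+R}
```

Here N_correct is the number of correctly extracted items, N_extracted the number of items the system extracted, and N_relevant the number of items that should have been extracted.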
Keywords/Search Tags: Ontology, Information Extraction, Syntax Parsing, Extraction Rule