
Hardware Research On The Measurement System Of Ring Laser Resonant Cavity

Posted on: 2012-03-30 | Degree: Master | Type: Thesis
Country: China | Candidate: J Chen | Full Text: PDF
GTID: 2178330332989402 | Subject: Computer technology
Abstract/Summary:
The main purpose of this paper is to present a prototype information extraction system that compares the prices of medicines across different web sites. The system is grounded in information extraction theory, in particular the theories and techniques for extracting information from HTML. The paper is organized as follows. First, the related theories and tools of information extraction are introduced, including approaches based on natural language processing, wrapper induction, ontologies, HTML structure, and the web search process. Next, several knowledge representation methods from the field of artificial intelligence (AI) are introduced, including conceptual graphs, the object-oriented method, and methods based on rough set theory, XML, Petri nets, and frame structures. The main foundations of the prototype system, namely HTML information extraction techniques and regular expressions, are then discussed in detail. Building on that discussion, the author describes the prototype system in detail and gives examples of its execution results.
The last part of the paper presents the conclusions and some suggestions for further work. The major achievement of this paper is a prototype information extraction system consisting of two modules:
1) A background information extraction module, which supports three methods of extraction:
(1) Extracting information from web pages that share the same format.
(2) Extracting information from web pages of different formats and saving it to the same table.
(3) Crawling web pages with a web spider program and extracting information from the downloaded pages.
2) A front-end information query module, which queries the extraction results and displays them in a table.
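To illustrate the second extraction method (pages of different formats saved to the same table), the following is a minimal Python sketch, not the thesis's actual implementation. The site names, the HTML snippets, and the regular expressions are all hypothetical; the only idea taken from the abstract is that regular expressions pull fields out of HTML and that records from differently formatted pages end up in one common table, where prices can be compared.

```python
import re

# Hypothetical per-site patterns. Each pattern uses the same named groups
# ("name", "price") so pages of different formats map onto one record layout.
SITE_PATTERNS = {
    "site_a": re.compile(
        r'<td class="name">(?P<name>[^<]+)</td>\s*'
        r'<td class="price">(?P<price>[\d.]+)</td>'
    ),
    "site_b": re.compile(
        r'<li>\s*<b>(?P<name>[^<]+)</b>\s*-\s*\$(?P<price>[\d.]+)\s*</li>'
    ),
}


def extract(site, html):
    """Return one record per match of the site's pattern in the page."""
    pattern = SITE_PATTERNS[site]
    return [
        {
            "site": site,
            "name": m.group("name").strip(),
            "price": float(m.group("price")),
        }
        for m in pattern.finditer(html)
    ]


# Hypothetical pages from two sites that format the same product differently.
page_a = '<tr><td class="name">Aspirin 100mg</td> <td class="price">12.50</td></tr>'
page_b = '<ul><li><b>Aspirin 100mg</b> - $11.80</li></ul>'

# The common "table": a list of records with identical fields.
table = extract("site_a", page_a) + extract("site_b", page_b)

# A price comparison is then a simple query over the table.
cheapest = min(table, key=lambda row: row["price"])
```

Keeping the named groups identical across patterns is one possible design choice: supporting a new page format then only requires adding a pattern, while the table layout and the query code stay unchanged.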
Keywords/Search Tags:information extraction, knowledge representation, regular expression, web spider