Research On The Ontology-based Information Extraction For Personal Homepage

Posted on: 2013-09-13    Degree: Master    Type: Thesis
Country: China    Candidate: S Jia    Full Text: PDF
GTID: 2248330371970104    Subject: Management Science and Engineering
Abstract/Summary:
In today's fast-developing information age, the growth of the Internet has produced more and more information carriers, and with them a large amount of irrelevant information. As the volume of information grows, selecting the valuable items by hand is no longer practical. Yet the demand for information remains intense, and selecting the information that users actually care about, so that they can obtain it efficiently, has become an important topic of contemporary research. These pressures have driven research on information extraction: extraction tools have become essential assistants, and extraction techniques continue to develop. Most of this growth in information takes place on the Internet, but because Web information sources are heterogeneous and lack structure, access to this huge collection has largely been limited to browsing and searching. Information extraction techniques for the Web are an effective way to ease this situation. Compared with Web data mining applications, which carry very high maintenance costs, Web information extraction consistently converts input pages into a unified data format in a mechanical way. This thesis therefore uses Web information extraction techniques to select information items, builds a model of the domain under study, and designs an information extraction system to extract information from pages.

The general idea of this thesis is to introduce ontology into information extraction. An ontology provides a standardized description of concepts and relations, which gives it an inherent advantage in model design, and concepts and relations are processed intensively, so extracting sample information within a framework of domain information is more reasonable. Used as the central tool in this thesis, the ontology model goes through a specialized construction process and achieves satisfactory results with good versatility and interoperability, which lowers the dependence of the extraction task on the structure of Web pages. Although the ontology describes the domain, it is paired with a concrete sample instance, and both are indispensable foundations of this thesis.

The Web resources studied in this thesis are personal homepages. In line with the author's academic background, professors' homepages serve as the data source for designing the information extraction system. The thesis first introduces the background on personal homepages, ontology, and information extraction. It then derives an extraction strategy by comparing and analyzing the structural characteristics of personal homepages and integrating the characteristics of ontology. The focus is on designing an ontology model suited to personal homepage information and on developing and using it with ontology tools, which covers checking, inference, and storage of the ontology. Page information is then selected by building extraction rules and applying an information extraction algorithm together with the ontology model. The system interface is designed to be simple, so that users can extract the information they need from target sites and view the descriptions of all information items. The design of the ontology and of the information extraction rules is the core of this thesis.

Compared with other methods, ontology-based information extraction lets domain experts define the concepts, the relations, the constraints among them, and the hierarchical structure for a given domain, and derive extraction rules that serve as the criteria applied to input documents during extraction. In theory, a sufficiently strong domain ontology can make the extraction task sufficiently accurate, so the research in this thesis is of some significance for improving the completeness (recall) and accuracy (precision) of extraction.
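
As a rough illustration of how ontology concepts might be paired with extraction rules, here is a minimal Python sketch. It is not the system described in the thesis: the `Concept` class, the `extract` function, the concept names, the regular-expression rules, and the sample page are all assumptions made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Concept:
    """One ontology concept plus a hypothetical extraction rule (a regex)."""
    name: str
    pattern: str


# Hypothetical fragment of a "professor homepage" ontology (illustration only).
PROFESSOR_ONTOLOGY = [
    Concept("email", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    Concept("phone", r"\+?\d[\d\s-]{6,}\d"),
    Concept("title", r"\b(?:Associate Professor|Assistant Professor|Professor|Lecturer)\b"),
]


def strip_tags(html: str) -> str:
    """Crudely remove HTML tags and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def extract(html: str, ontology: list) -> dict:
    """Apply each concept's rule to the page text and collect the matches."""
    text = strip_tags(html)
    return {
        concept.name: [m.group(0) for m in re.finditer(concept.pattern, text)]
        for concept in ontology
    }


# Invented sample page; the person and address are placeholders.
SAMPLE_PAGE = (
    "<html><body><h1>Jane Doe</h1>"
    "<p>Associate Professor, School of Management</p>"
    "<p>Email: jdoe@example.edu</p></body></html>"
)

if __name__ == "__main__":
    for name, values in extract(SAMPLE_PAGE, PROFESSOR_ONTOLOGY).items():
        print(name, values)
```

Because the rules hang off the concepts and not off any page layout, the same ontology can be applied to pages with different HTML structure, which is the kind of reduced dependence on Web structure the abstract describes. A real ontology would also carry relations and constraints, which this sketch leaves out.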
Keywords/Search Tags: Personal Homepage, Ontology, Extraction Rules, Information Extraction