Domain Ontology-based Web Information Extraction Technology

Posted on: 2009-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Bi
Full Text: PDF
GTID: 2208360242493654
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the World Wide Web (WWW), the volume of information has grown rapidly, and the Internet has become the most important source from which people obtain information. The Internet presents a stunning variety of online information resources, and the amount of data accessible via the Web is staggeringly large and still growing. Web Information Extraction is an effective approach to Web information integration. Web information is usually formatted for human readers rather than machines, and no provision is made for automating its processing. That is, the Web's browsing paradigm does not support many information management tasks.

Web Information Extraction is similar to Web Information Retrieval in that the main purpose of both is to obtain the information users need, but the two differ considerably. Unlike Information Retrieval, which returns the relevant Web pages directly, Web Information Extraction converts Web resources of different forms into a single, uniform structure. It provides strong support for data mining, new search engines, and specialized search engines. Information Extraction can be regarded as a deepening of Information Retrieval: it studies how to find, understand, and extract specified information and return the result in a suitable form.

This thesis discusses the key issues in these research topics:

1. Electronic business has recently fueled growing research interest in business information extraction and market intelligence management. Information can be spread on the Web quickly and conveniently, so more and more companies and individuals publish information about their products there, such as cars and real estate. This information is well formed, but its structure differs completely between sites. This thesis proposes a method that uses a domain-specific ontology to extract the features of such information and present the results, in order to meet users' needs. Experiments are carried out on real estate information.

2. Some business information on the Web is not presented in a tabular structure but as free text, and it cannot be extracted by the system above. Free-text extraction is therefore studied, taking the characteristics of free text into account.

3. The theory of attribute reduction for domain-specific ontologies is discussed, with the aim of optimizing Web information extraction.
Keywords/Search Tags: Information extraction, Search engine, Data mining, Information retrieval, Domain-specific ontology