
Research On Deep Web Integration And Its Related Several Technologies

Posted on: 2009-11-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H X Xu
GTID: 1118360272459229
Subject: Computer software and theory
Abstract/Summary:
Web information can be classified into the Surface Web and the Deep Web according to its depth. The Surface Web consists of pages that traditional search engines can index by following hyperlinks. The Deep Web, by contrast, is content that traditional search engines cannot see: its pages do not exist until they are created dynamically in response to a specific query. Because traditional search engine crawlers cannot probe beneath the surface, the Deep Web has so far remained hidden.

Organizing and processing the large amount of Deep Web information is a great challenge for information science and technology. As a key technology for this task, Deep Web integration can largely solve the problem of information disorder and helps users find the information they need quickly. Moreover, Deep Web integration has broad application prospects as a technical basis for information retrieval, search engines, personal services, and so on.

This paper studies Deep Web integration and its related technologies. Our primary contributions are as follows.

(1) Study on integration models of Deep Web systems
Because user requirements vary, different integration models of Deep Web systems should be considered. In this paper, we first study the relevant Deep Web models and their constraints; on that basis, we present different integration models of Deep Web systems and discuss their process flows. This work can serve as a reference for further research and application.

(2) A machine learning approach to the classification of Web databases
Classifying structured Web sources into domains is one of the critical steps toward the integration of heterogeneous Web sources. In this paper, we present a Deep Web model and a machine-learning-based classification model, and propose a novel weighting method.
The experimental results show that good performance can be achieved with a small number of training samples per domain, and that performance remains stable as the number of training samples increases.

(3) Ontology-based classification of Deep Web query interfaces
An ontology is a knowledge-based model used to represent the terms, relations, and rules of concepts in a machine-readable format. In this paper, we present an ontology-based classification of query interfaces, which includes a category ontology model and a novel weight calculation over the Vector Space Model (VSM). The experimental results show that the method achieves good performance.

(4) Knowledge-based processing of environmental changes in Deep Web integration
Based on a study of the dependencies among the components of Deep Web integration, we give a knowledge-based method for processing changes in such integration, which includes an environmental change processing model, a self-adaptive software architecture, and an algorithm. This method can serve as a reference for further research on, or application of, large-scale Deep Web integration. The experimental results show that the method not only processes the changes but also greatly improves the performance of the integrated system.

(5) Study on personal service over the Deep Web
Digital scientific references are usually provided as non-free Deep Web sources. The scale of the information can leave users confused and cause them to miss relevant material. Personal service over the Deep Web can alleviate this problem by letting the information itself "find" the users who need it. In this paper, we propose a framework for a personal service system over the Deep Web, which includes a user profile model based on the meta-description of digital resources, a Deep Web crawler, a novel pushing algorithm, and so on.
Finally, a personal service system over selected Deep Web sources is developed; with only a small amount of user intervention, the system can push to users the information they need.
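
As an illustration of the Vector Space Model classification mentioned in items (2) and (3), the following is a minimal sketch only. The abstract does not specify the novel weighting method or the ontology model, so standard smoothed TF-IDF weights and cosine similarity to per-domain centroids are used here as a stand-in. The domain labels, the sample interface field names, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training data: (domain, field labels taken from a query interface).
TRAINING = [
    ("books", "title author isbn publisher"),
    ("books", "author title keyword format"),
    ("flights", "departure city arrival city date passengers"),
    ("flights", "from to departure date return date"),
]

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def weight(tokens, idf):
    """Sparse TF-IDF vector (term -> weight); terms unseen in training are dropped."""
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}

def train(labeled):
    """Compute smoothed IDF values and one summed vector (centroid) per domain."""
    docs = [(label, tokenize(text)) for label, text in labeled]
    n = len(docs)
    df = Counter(t for _, toks in docs for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    centroids = {}
    for label, toks in docs:
        # Cosine similarity ignores vector length, so summing is enough.
        centroids.setdefault(label, Counter()).update(weight(toks, idf))
    return idf, centroids

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text, idf, centroids):
    """Return the domain whose centroid is most similar to the interface text."""
    vec = weight(tokenize(text), idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

if __name__ == "__main__":
    idf, centroids = train(TRAINING)
    print(classify("author isbn", idf, centroids))
    print(classify("departure date passengers", idf, centroids))
```

A different weighting scheme, such as the one the dissertation proposes, would replace only the `weight` function; the centroid comparison would stay the same.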
Keywords/Search Tags:Deep Web, Classification, Ontology, Knowledge, integration, Environmental Change, Personal Service, Crawler