Research On Ontology-based Automatic Filling Forms Of Deep Web Entries

Posted on: 2010-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: R Li
Full Text: PDF
GTID: 2178360272496996
Subject: Software engineering
Abstract/Summary:
With the proliferation of Web information, more and more information is stored in Deep Web databases rather than in static pages. Compared with the Surface Web, the Deep Web holds more information of higher quality, and it is also the fastest-growing information carrier. The Deep Web has therefore become a hot spot in search research. At present, traditional search engines can only retrieve links to static pages, while a massive amount of valuable information is hidden in Web databases (WDBs); users can obtain this dynamic information only by submitting queries through a WDB's query interface. Given the huge number of query interfaces in any one domain, filling in forms and submitting queries by hand is clearly impractical. To address this problem, we attempt to fill in the forms of Deep Web entries automatically, which is the motivation for the topic of this thesis.

In this thesis, I present a thorough analytical study of the automatic filling of Deep Web entry forms and propose solutions supported by experiments. The approach proceeds through the following steps: constructing a domain ontology and generating a unified query interface under the guidance of the ontology; extracting the schema information of query interfaces; matching the extracted schema information to ontology attributes; and transforming queries. Together these steps simulate the user's operations and fill in Deep Web entry forms automatically. The techniques involved cover four major directions of Deep Web research: schema extraction, schema matching, unified query interfaces, and query transformation. In these directions, domestic research is still at the stage of introducing and imitating existing work; there is little theoretical originality, and practical Deep Web search engine systems are extremely rare.
Our study of this topic shows that current research has general deficiencies, especially in semantic understanding. The existing work falls into two groups. One group neglects semantic understanding in Deep Web search entirely, or touches on semantics only superficially; in other words, its methods and theory are not yet mature. The other group recognizes the importance of semantic understanding, but most of its methodology is based on statistical machine learning, which leaves room for improvement. To compensate for the lack of semantic analysis in purely statistical machine learning methods, this thesis uses ontology technology, which has many advantages for semantic understanding, to improve the accuracy of information retrieval. An ontology aims to capture the knowledge of a domain, provide a common understanding of that knowledge, identify the vocabulary commonly recognized in the domain, and give clear definitions of these terms and their interrelationships at different levels of formalization. Because an ontology has a well-defined hierarchical structure and expresses semantics through the relationships between concepts, it is well suited to the semantic analysis of text, web pages, and other data. In summary, this thesis introduces ontologies into Deep Web search technology to solve schema matching, unified query interface generation, and query conversion at the semantic level, which gives the work a degree of novelty.

This thesis first presents the overall structure of automatic Deep Web entry form filling and divides its basic functions into four modules: an ontology management module, a schema extraction module, an ontology mapping module, and a query conversion module.
(1) The ontology management module is the foundation of the overall structure. It serves as the knowledge base of the prototype system and mainly reflects the overall schema of the domain. Its work has two parts: semi-automatic construction of the domain ontology, and guiding the generation and updating of the unified query interface. The semi-automatic construction is reflected in the ontology's self-learning ability, that is, the ability to revise the ontology during the matching process. The unified query interface is generated and updated with a method based on ontology weight values: the top N concepts with the highest frequency-based weights are selected to generate the corresponding unified query interface schema.

(2) The schema extraction module is a prerequisite for automatic filling: it converts an entry form into a machine-understandable schema set, and it directly affects the accuracy and effectiveness of the subsequent filling work. In this module, the schema information of an entry form is first defined at the logical level; the visual characteristics of the page are then analyzed; and finally, based on this analysis, heuristic rules and the corresponding extraction algorithms are given.

(3) The main function of the ontology mapping module is to establish the mapping between entry form schemas and ontology knowledge. Its output is a local-interface-to-ontology mapping table, which the query conversion module then uses. To establish the mapping, this thesis proposes a matching algorithm that distinguishes two cases, direct matching and indirect matching; in the conditional branches of indirect matching, it records the basic information needed to revise the ontology. This information is saved in a table of concepts to be judged and a table of candidate sub-concepts, for use by the ontology management module.

(4) The query conversion module follows the ontology mapping module. Its input is the mapping table produced in the previous step. By analyzing the mapping table, it determines the type of schema matching involved; on that basis, it handles the query rewriting problem for each matching type, converts the user's query on the unified interface into local queries, and finally fills in the local entry forms.

Chapter V of this thesis describes the experiments and analyzes their results. The experiments show that the proposed method for automatically filling Deep Web entry forms is feasible and effective. No method is perfect, however, and all require continual improvement. The last part of the thesis therefore analyzes the shortcomings of the method and points out directions for future research.

This thesis aims to provide a framework of effective ideas and solutions for the automatic filling of Deep Web entry forms. The topic is grounded in the major technologies of the Deep Web search field, and the work has both practical application value and value for theoretical research.
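As a rough illustration of how the ontology mapping and query conversion modules described above might fit together, the following Python sketch maps form field labels to ontology concepts and then rewrites a unified query into a local one. It is not the algorithm of the thesis: the toy ontology, the field names, the string-similarity test used for "indirect" matching, and the 0.7 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical toy ontology: concept -> known synonyms (not taken from the thesis).
ONTOLOGY = {
    "title": {"title", "book title"},
    "author": {"author", "writer"},
    "price": {"price", "cost"},
}


def normalize(label):
    """Lower-case a form label and collapse underscores and whitespace."""
    return " ".join(label.lower().replace("_", " ").split())


def match_field(label, ontology, threshold=0.7):
    """Return (concept, kind); kind is 'direct', 'indirect', or None."""
    text = normalize(label)
    # Direct match: the label is a concept name or a recorded synonym.
    for concept, synonyms in ontology.items():
        if text == concept or text in synonyms:
            return concept, "direct"
    # Indirect match (assumed here to be plain string similarity).
    best_concept, best_score = None, 0.0
    for concept, synonyms in ontology.items():
        for term in synonyms | {concept}:
            score = SequenceMatcher(None, text, term).ratio()
            if score > best_score:
                best_concept, best_score = concept, score
    if best_score >= threshold:
        return best_concept, "indirect"
    return None, None


def build_mapping(form_fields, ontology):
    """Build the concept -> local field table and collect revision candidates."""
    mapping = {}
    candidates = []  # (label, concept) pairs an ontology manager could review
    for field_name, label in form_fields.items():
        concept, kind = match_field(label, ontology)
        if concept is None:
            continue  # no concept found; the field stays unmapped
        mapping.setdefault(concept, field_name)
        if kind == "indirect":
            candidates.append((normalize(label), concept))
    return mapping, candidates


def translate_query(unified_query, mapping):
    """Rewrite a query on unified concepts into local form field values."""
    return {mapping[c]: v for c, v in unified_query.items() if c in mapping}


# Hypothetical local form: field name -> visible label.
form_fields = {"q_ttl": "Book Title", "q_auth": "Authors", "q_isbn": "ISBN"}
mapping, candidates = build_mapping(form_fields, ONTOLOGY)
local_query = translate_query(
    {"title": "Dune", "author": "Herbert", "price": "10"}, mapping
)
```

In this sketch, "Book Title" matches directly through a recorded synonym, "Authors" matches indirectly and is recorded as a candidate for ontology revision, and "ISBN" stays unmapped. The "price" constraint is dropped during translation because the local form has no matching field; a real system would need a policy for such unsupported constraints.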
Keywords/Search Tags: Deep Web, Ontology, Schema Extraction, Matching, Query Translation