Research On The Deep Web Interface Schema Matching Based On The Machine Learning

Posted on: 2013-01-31  Degree: Master  Type: Thesis
Country: China  Candidate: Q Q Jiao  Full Text: PDF
GTID: 2248330377458929  Subject: Computer application technology
Abstract/Summary:
With the rapid development of the World Wide Web, the information contained in the deep web is increasing fast. Because the deep web is large-scale, heterogeneous and autonomous, it is a huge challenge to let users efficiently and quickly find the information that satisfies them. Deep web query interface integration can solve this problem. Interface schema matching is the most important step in deep web query interface integration. Only through this step can the query interfaces be integrated into a global query interface. Users can then submit their queries through the global query interface and get the most satisfactory results.

This thesis studies deep web interface schema matching. The purpose of deep web interface schema matching is to find the best attribute match between a partial query interface and the global query interface, and to resolve cases where the same name carries different meanings or different names are synonymous.

Many methods exist for deep web query interface schema matching, but most of them target schema matching among partial query interfaces and ignore the characteristics of the deep web. Based on research into schema matching between partial query interfaces and the global query interface that takes advantage of the characteristics of the deep web, a deep web interface schema matching method based on machine learning is proposed. This method transforms the schema matching problem into a machine learning classification problem. The method uses multi-strategy learning, which achieves higher accuracy than a single learner. To further improve matching accuracy, the concept of domain ontology is introduced in both the training phase and the matching phase. In addition, because query interfaces often contain a wealth of structural information, a new learning algorithm is proposed to take full advantage of the hierarchical tree information of the interface. This step can effectively correct the earlier matching results and improve the accuracy of the final matching results.

To assess the implementation of deep web interface schema matching based on machine learning, this thesis uses 120 data sources as the training set and 40 data sources as the schemas to be matched, in the airline ticket and book sales domains. The experimental results show that the method has high accuracy.
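The idea of treating schema matching as classification with multi-strategy learning can be illustrated with a minimal sketch. It is not the thesis's implementation: the attribute names, training labels, the two base learners and the fixed meta-learner weights are all hypothetical, chosen only to show how several learners' scores for a label from a partial query interface might be combined into one prediction of a global interface attribute.

```python
from difflib import SequenceMatcher

# Hypothetical global interface attributes, each with a few example labels
# as they might appear on training interfaces (made-up data).
TRAINING = {
    "departure_city": ["from", "departure city", "leaving from", "origin"],
    "arrival_city": ["to", "destination", "arrival city", "going to"],
    "departure_date": ["depart date", "departure date", "leave on"],
}


def name_learner(label, examples):
    """Base learner 1: best character-level similarity to any training label."""
    return max(SequenceMatcher(None, label.lower(), e).ratio() for e in examples)


def token_learner(label, examples):
    """Base learner 2: best Jaccard overlap between word-token sets."""
    words = set(label.lower().split())
    best = 0.0
    for e in examples:
        other = set(e.split())
        best = max(best, len(words & other) / len(words | other))
    return best


def meta_learner(scores, weights=(0.5, 0.5)):
    """Combine base learner scores; the weights are assumed, not trained here."""
    return sum(w * s for w, s in zip(weights, scores))


def match_attribute(label):
    """Classify a partial-interface label as one global interface attribute."""
    combined = {
        attr: meta_learner((name_learner(label, ex), token_learner(label, ex)))
        for attr, ex in TRAINING.items()
    }
    return max(combined, key=combined.get)


if __name__ == "__main__":
    for label in ["Departure City", "Destination"]:
        print(label, "->", match_attribute(label))
```

In a fuller system the meta-learner weights would be learned from the training interfaces rather than fixed, and further learners (for example ones based on a domain ontology or on the interface's tree structure) could be added in the same way.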
Keywords/Search Tags: Deep web, schema matching, multi-strategy learning technologies, meta-learner, domain ontology