Research On The Domain-oriented Deep Web Query Interface Discovery

Posted on: 2015-12-30
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Li
Full Text: PDF
GTID: 2298330452451420
Subject: Computer application technology
Abstract/Summary:
The deep web refers to data located beneath the surface web; both its volume and its value far exceed those of the surface web. For this reason, mining the deep web has become a hot topic, and research on Deep Web information integration is particularly important. The first step in Deep Web data integration is to find the Web databases, that is, to find their query interfaces. Two technical difficulties stand out. First, the efficiency of retrieving the web pages that contain query interfaces needs to be improved. Second, query interfaces exist as forms, but not all forms are query interfaces, so improving classification accuracy is also a serious problem.

To address these two problems in Deep Web query interface discovery, this thesis does the following work:

First, it surveys Deep Web research, including the Deep Web concept, scale, existence, access methods, research directions, and the content of this thesis.

Second, it reviews the query interface discovery technologies used in the past, including DOM parsing and the commonly used heuristic rules, and then analyzes and compares the main query interface discovery algorithms.

Third, to improve the efficiency of obtaining domain-oriented Deep Web query interfaces, it presents query interface discovery algorithms in both single-threaded and multithreaded versions. A comparison of the test results shows that the multithreaded algorithm significantly improves efficiency.

Finally, to correctly identify Deep Web query interfaces among forms, it builds on previous studies and proposes a heuristic rule-based K-Nearest Neighbor algorithm. For the experiments, query interfaces and non-query interfaces were collected in a variety of ways from a number of domains. The results show that the algorithm significantly improves Deep Web query interface discovery: accuracy, precision, and recall all improve significantly.
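
As a rough illustration of the two techniques summarized above, the following minimal Python sketch combines a heuristic pre-filter with a K-Nearest Neighbor vote over simple form features, and processes pages with a thread pool. It is not the thesis's implementation: the feature set, the heuristic rules, the labeled examples in `TRAINING`, the sample pages in `PAGES`, and all function names are assumptions made up for this example, and the in-memory HTML strings stand in for pages that a real crawler would fetch over the network.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
import math

# Hypothetical feature set: counts of a few control types per form.
FEATURES = ("text", "password", "select", "submit")

# Made-up labeled examples: (feature vector, is_query_interface).
TRAINING = [
    ((2, 0, 3, 1), True),
    ((3, 0, 2, 1), True),
    ((1, 0, 4, 1), True),
    ((1, 0, 0, 1), False),
    ((2, 0, 0, 1), False),
    ((4, 0, 0, 1), False),
]


class FormExtractor(HTMLParser):
    """Collect one feature dict per <form> element in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self._current = dict.fromkeys(FEATURES, 0)
        elif self._current is not None:
            # An <input> without a type attribute defaults to a text box.
            if tag == "input":
                kind = (dict(attrs).get("type") or "text").lower()
            else:
                kind = tag
            if kind in self._current:
                self._current[kind] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def passes_heuristics(form):
    """Assumed rules: reject login forms and forms with nothing to query."""
    if form["password"] > 0:
        return False
    return form["text"] + form["select"] > 0


def knn_is_query_interface(form, labeled, k=3):
    """Majority vote among the k labeled forms nearest in Euclidean distance."""
    vec = [form[name] for name in FEATURES]
    nearest = sorted(labeled, key=lambda ex: math.dist(vec, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def discover(html):
    """Return one True/False decision per form found in the page."""
    parser = FormExtractor()
    parser.feed(html)
    return [passes_heuristics(f) and knn_is_query_interface(f, TRAINING)
            for f in parser.forms]


def discover_all(pages, workers=4):
    """Classify the forms of many pages concurrently with a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(pages.keys(), pool.map(discover, pages.values())))


# In-memory stand-ins for fetched pages.
PAGES = {
    "books": '<form><input type="text" name="title"><input name="author">'
             '<select name="format"><option>Any</option></select>'
             '<select name="lang"></select><input type="submit"></form>',
    "login": '<form><input type="text" name="user">'
             '<input type="password" name="pw"><input type="submit"></form>',
    "news": '<form><input type="text" name="email">'
            '<input type="submit"></form>',
}

if __name__ == "__main__":
    print(discover_all(PAGES))
```

With these toy inputs, only the form on the "books" page is accepted: the login form is rejected by the password rule before KNN runs, and the single-field form is voted down by its nearest labeled neighbors. In a real crawler the thread pool would pay off mainly in the page-fetching step, which is I/O-bound and is omitted here.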
Keywords/Search Tags: deep web, query interface, multithreading, k-nearest neighbor algorithm