Selection Of Deep Web Database

Posted on: 2010-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: J C Fan
GTID: 2198330302956157
Subject: Computer software and theory
Abstract/Summary:
As Internet technologies mature, the Web hosts a growing number of rich online databases. The information held in these Web databases is not directly visible to users and is called the Deep Web. A Deep Web data integration system makes it possible to obtain valuable information from the Deep Web effectively. Web database selection is an essential part of such a system because it improves both the efficiency of information retrieval and the accuracy of queries.
This thesis addresses two key problems in Web database selection. The first is how to represent query features and obtain Web database features. The second is how to rank the Web databases for a specific query.
To obtain Web database features, the thesis builds an initial query set from domain knowledge. The query set is updated dynamically: as users submit queries, a representative set of frequent queries is built up and submitted to all local Web databases. The features of each Web database are then derived by analyzing where the query words appear in its returned results and how those results relate to the query words.
To rank the Web databases, the results returned by each database are classified by their characteristics. Two parameters are computed: the relevance between the query words and each type of result, and the proportion of the results from all Web databases that each type accounts for. The relevance between the query and a Web database is the weighted sum of these two parameters, and the Web databases are ranked by this relevance. Based on the ranking, the growth rate of relevant query results is analyzed to determine a threshold on the number of Web databases to select, so that the selected subset is both efficient and accurate.
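The ranking and threshold steps can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the function names `rank_databases` and `select_top`, the weight `alpha`, the cutoff `min_gain`, and the idea that each database has already been reduced to one relevance value and one proportion value.

```python
from typing import Dict, List, Tuple


def rank_databases(
    features: Dict[str, Tuple[float, float]], alpha: float = 0.5
) -> List[Tuple[str, float]]:
    """Rank databases by a weighted sum of two parameters.

    features maps a database name to (relevance, proportion).
    alpha is an assumed weight; the thesis does not state its value.
    """
    scores = {
        db: alpha * relevance + (1 - alpha) * proportion
        for db, (relevance, proportion) in features.items()
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_top(ranked: List[Tuple[str, float]], min_gain: float = 0.1) -> List[str]:
    """Keep adding databases while each one still grows the cumulative
    score by at least min_gain (an assumed growth-rate threshold)."""
    selected: List[str] = []
    total = 0.0
    for db, score in ranked:
        # Stop once the next database adds too little relative to the total.
        if selected and (total == 0 or score / total < min_gain):
            break
        selected.append(db)
        total += score
    return selected


if __name__ == "__main__":
    example = {"A": (0.9, 0.5), "B": (0.4, 0.3), "C": (0.05, 0.02)}
    ranked = rank_databases(example)
    print(ranked)
    print(select_top(ranked))
```

A relative-gain cutoff is only one way to read "growth rate"; a fixed number of databases or an absolute score threshold would fit the same description.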
Finally, when a user submits a new query, the subsets of Web databases selected in the previous step are taken as input, and the Apriori algorithm is used to compute the maximal frequent itemsets. The most frequent Web databases then serve as the new default Web database selection sequence. Experiments show that the method can effectively assess Web databases and evaluate how well each one supports specific queries.
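The Apriori step can be sketched as follows, assuming each past query contributes one "transaction": the set of Web databases selected for it. The function names `apriori` and `maximal_itemsets`, and the use of an absolute support count as the threshold, are illustrative assumptions and not details taken from the thesis.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(
    transactions: Iterable[Iterable[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Return every frequent itemset with its support count."""
    baskets = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item.
    current = {frozenset([item]) for basket in baskets for item in basket}
    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        current = {
            c
            for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def maximal_itemsets(frequent: Dict[FrozenSet[str], int]) -> List[FrozenSet[str]]:
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    # Hypothetical selections made for four earlier queries.
    history = [{"A", "B", "C"}, {"A", "B"}, {"A", "B", "C"}, {"B", "C"}]
    found = apriori(history, min_support=2)
    print(maximal_itemsets(found))
```

The maximal itemsets give candidate default selections; how they are ordered into a selection sequence is not specified in the abstract.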
Keywords/Search Tags: Deep Web database, Frequent query, Relevance, Apriori algorithm