A Self-adaptive Cross-domain Query Strategy On The Deep Web

Posted on:2012-01-20

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Li

GTID:2248330395958148

Subject:Computer software and theory

Abstract/Summary:

As the amount of information on the Web grows, so does the volume of information held in the Deep Web. The goal of current Deep Web data integration is to access these databases as automatically as possible. Data sources in the Deep Web cover many different domains, and domain-oriented data integration techniques have matured, giving rise to many domain-oriented Deep Web data integration systems. In this thesis, we assume that all Deep Web data sources have been clustered by domain, and that each cluster, which integrates all data sources belonging to one domain, corresponds to a global query interface. With the growing number of Deep Web applications, cross-domain querying has become a pressing need. This thesis studies how to meet users' need for cross-domain queries.

To solve this problem, this thesis proposes a system that performs cross-domain queries automatically. It consists of two parts: (1) Find the correlation between different domains and construct a domain correlation model. In this step, we analyze the correlation between domains based on the attributes of the data sources' query interfaces and on the attribute values. We present an algorithm that calculates the correlation between two data sources from two domains and judges whether the two domains are correlated. (2) When a user issues a query through the global query interface of some domain, we construct a query tree based on the domain correlation graph. Furthermore, we present a cross-domain Query Path Evaluating Model (QPEM) that ranks and recommends the top-k query paths to cover all possible query intentions.

We use samples of Web databases as the basis for selecting databases. First, we choose the Web databases that match the user's query according to the samples. Second, to reduce query cost, queries are sent only to the real Web databases that were chosen. Furthermore, the content correlation between data sources is also computed from the samples.
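
The abstract does not give the correlation algorithm itself, so the following is only a minimal sketch of the general idea: scoring two data sources by the overlap of their query-interface attributes and of their sampled attribute values. The use of Jaccard similarity, the equal weighting, the threshold, and all names (`jaccard`, `source_correlation`, `domains_correlated`, the example sources) are assumptions made for illustration, not the method of the thesis.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def source_correlation(src_a, src_b, attr_weight=0.5):
    """Blend attribute-name overlap with sampled-value overlap.

    The weighting scheme is an assumption; the abstract only says that
    both interface attributes and attribute values are used.
    """
    attr_sim = jaccard(src_a["attributes"], src_b["attributes"])
    value_sim = jaccard(src_a["sample_values"], src_b["sample_values"])
    return attr_weight * attr_sim + (1 - attr_weight) * value_sim


def domains_correlated(sources_a, sources_b, threshold=0.2):
    """Assumed rule: two domains are correlated if any cross-domain
    pair of data sources reaches the threshold."""
    return any(
        source_correlation(a, b) >= threshold
        for a in sources_a
        for b in sources_b
    )


# Hypothetical sampled data sources from three domains.
flights = {"attributes": {"city", "date", "airline"},
           "sample_values": {"Beijing", "Shanghai", "2012-01-20"}}
hotels = {"attributes": {"city", "date", "stars"},
          "sample_values": {"Beijing", "Shanghai", "Hilton"}}
books = {"attributes": {"title", "author", "isbn"},
         "sample_values": {"Dune", "Herbert"}}

print(source_correlation(flights, hotels))
print(domains_correlated([flights], [books]))
```

Working on samples rather than on the live databases keeps this comparison cheap, which matches the abstract's stated reason for sampling.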
The QPEM ranks and recommends the top-k query paths based on the correlation between data sources, the quality and out-degree of the parent data source, and the quality and in-degree of the child data source.

In the experiments, we obtain high precision for data source correlation. In addition, we compare user satisfaction across four standardization methods and evaluate the influence of query coverage on user satisfaction. The experimental results show the effectiveness of the method proposed in this thesis.
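
The abstract names the factors that QPEM uses but not how they are combined. As a rough illustration only, the sketch below ranks query paths in a small correlation graph with an assumed scoring rule: each edge is scored as correlation × (parent quality / parent out-degree) × (child quality / child in-degree), and a path's score is the product of its edge scores. The formula, the depth limit, and every name (`edge_score`, `enumerate_paths`, `top_k_paths`, the example graph) are hypothetical and are not taken from the thesis.

```python
import heapq


def edge_score(graph, quality, in_degree, parent, child):
    """Assumed edge score combining the five factors named in the abstract."""
    corr = graph[parent][child]
    out_deg = len(graph[parent])
    return corr * (quality[parent] / out_deg) * (quality[child] / in_degree[child])


def enumerate_paths(graph, start, max_depth):
    """Yield every simple path from start with 1..max_depth edges."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            yield path
        if len(path) - 1 < max_depth:
            for child in graph.get(path[-1], {}):
                if child not in path:  # avoid cycles
                    stack.append(path + [child])


def top_k_paths(graph, quality, start, k, max_depth=3):
    """Return the k highest-scoring (score, path) pairs from start."""
    in_degree = {}
    for children in graph.values():
        for child in children:
            in_degree[child] = in_degree.get(child, 0) + 1
    scored = []
    for path in enumerate_paths(graph, start, max_depth):
        score = 1.0
        for parent, child in zip(path, path[1:]):
            score *= edge_score(graph, quality, in_degree, parent, child)
        scored.append((score, path))
    return heapq.nlargest(k, scored)


# Hypothetical correlation graph: graph[parent][child] = correlation.
graph = {
    "flights": {"hotels": 0.8, "cars": 0.4},
    "hotels": {"restaurants": 0.5},
}
quality = {"flights": 1.0, "hotels": 0.9, "cars": 0.6, "restaurants": 0.7}

for score, path in top_k_paths(graph, quality, "flights", k=2):
    print(" -> ".join(path), round(score, 4))
```

Dividing quality by degree is one way to keep a heavily connected source from dominating every path; other normalizations would fit the same description equally well.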

Keywords/Search Tags:

Deep Web, cross-domain, top-k, domain correlation

Related items

1	Research And Implementation Of Cross-Domain Sentiment Classification Algorithm Based On Deep Learning
2	Research On Cross-domain Object Detection In Remote Sensing Images
3	Research On Cross-domain Recommendation Method Based On System Correlation
4	Cross-domain Action Recognition Algorithms Via Deep Spatial-Temporal Network
5	Research On Cross-domain Person Re-identification Based On Deep Learning
6	Research On The Cross-race Face Recognition Algorithm Based On Deep Domain Adaptation
7	Named Entity Recognition In Cross Language And Cross Domain Situations
8	Design of a keyword spotting system using modified cross-correlation in the time and the MFCC domain
9	Cross-domain Transfer Recommendation Model Based On User Genre Preference
10	Research Based On Content And Graph Information Fusing For Cross-domain