Keyword Query Over The Database Based On User Feedback

Posted on: 2015-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
Full Text: PDF
GTID: 2268330431456827
Subject: Computer software and theory
Abstract/Summary:
Vast volumes of structured and semi-structured data exist in enterprises and on the Web. To explore data in a relational database, users have to understand how the information is stored and know how to express their requests precisely in SQL, which is difficult for ordinary users.

The success of keyword queries as a common means of Web search and exploration has spurred much interest in the research community in supporting effective and efficient keyword search over relational databases. Keyword search allows information retrieval (IR) from databases by simply supplying a set of keywords, without requiring users to know either a query language (such as SQL) or the database schema. The only thing users need to be concerned with is how to express the required information in keywords.

In the literature on database keyword search, there are two modeling methods: the data-graph-based approach and the schema-graph-based approach. Recent work on the schema-graph-based approach has attempted to improve search effectiveness by ranking the final results with better-designed scoring functions. However, very few works explicitly incorporate user feedback into the ranking of query results, even though using user feedback to enhance user experience has been extensively studied in the Web IR literature. The work presented in this thesis taps into the wealth of information contained in user feedback and aims to enhance search effectiveness by explicitly taking feedback information into account when ranking query results. To make the discussion concrete, we focus on schema-graph-based approaches to keyword search (using the seminal work DISCOVER as an example), which usually proceed in two stages: candidate network (CN) generation and CN evaluation.
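The first stage can be illustrated with a deliberately simplified sketch. It assumes a toy schema graph (the relation names `author`, `writes`, `paper`, and `venue` are hypothetical, not from the thesis) and treats a candidate network as a connected set of relations that covers every keyword, with each relation used at most once. DISCOVER itself works with trees of tuple sets and allows a relation to appear more than once, so this is an approximation of the idea rather than the actual algorithm.

```python
from itertools import combinations

# Hypothetical toy schema graph: nodes are relations, edges are foreign-key links.
SCHEMA = {
    "author": {"writes"},
    "writes": {"author", "paper"},
    "paper": {"writes", "venue"},
    "venue": {"paper"},
}


def is_connected(relations, schema):
    """Return True if the given non-empty set of relations is connected in the schema graph."""
    relations = set(relations)
    start = next(iter(relations))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbour in schema[node] & relations:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == relations


def candidate_networks(keyword_matches, schema, max_size):
    """Enumerate minimal connected relation sets that cover all keywords.

    keyword_matches maps each keyword to the set of relations whose
    tuples contain it (assumed to come from an index lookup).
    """
    results = []
    for size in range(1, max_size + 1):
        for subset in combinations(sorted(schema), size):
            chosen = set(subset)
            covers_all = all(chosen & rels for rels in keyword_matches.values())
            if not covers_all or not is_connected(chosen, schema):
                continue
            # Minimality: skip sets that strictly contain a smaller valid network.
            if any(previous < chosen for previous in results):
                continue
            results.append(chosen)
    return results


matches = {"zhou": {"author"}, "feedback": {"paper"}}
print(candidate_networks(matches, SCHEMA, max_size=4))
```

In the second stage, each candidate network would be translated into a join query and evaluated against the database. Exhaustive subset enumeration is used here only for readability and does not scale beyond small schemas.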
We propose a new ranking strategy that uses the frequent patterns mined from query logs, a form of user feedback, to help rank the CNs generated during the first stage. Given the frequent patterns, we show how to compute the maximal score of a CN using a dynamic programming algorithm and a greedy algorithm. We prove that the problem of finding the maximal score is NP-hard. User studies on a real dataset validate the effectiveness of the proposed ranking strategy.
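The abstract does not give the scoring function, so the following is only a sketch under a stated assumption: a CN is a set of join edges, each frequent pattern is a set of join edges with a support value, and the score of a CN is the largest total support of edge-disjoint patterns contained in it. The names `edge`, `greedy_score`, and `exact_score` are hypothetical. The exact baseline here is brute-force enumeration, swapped in for simplicity; it is not the dynamic programming algorithm of the thesis.

```python
from itertools import combinations


def edge(a, b):
    """An undirected join edge between two relations."""
    return frozenset((a, b))


def greedy_score(cn_edges, patterns):
    """Pick patterns by descending support, skipping any that overlap earlier picks."""
    used = set()
    score = 0.0
    for pattern_edges, support in sorted(patterns, key=lambda p: p[1], reverse=True):
        if pattern_edges <= cn_edges and not (pattern_edges & used):
            used |= pattern_edges
            score += support
    return score


def exact_score(cn_edges, patterns):
    """Brute-force baseline: try every combination of usable patterns."""
    usable = [p for p in patterns if p[0] <= cn_edges]
    best = 0.0
    for r in range(1, len(usable) + 1):
        for combo in combinations(usable, r):
            covered = set()
            disjoint = True
            for pattern_edges, _ in combo:
                if pattern_edges & covered:
                    disjoint = False
                    break
                covered |= pattern_edges
            if disjoint:
                best = max(best, sum(support for _, support in combo))
    return best


# Assumed example data: a three-edge CN and three mined patterns.
e1 = edge("author", "writes")
e2 = edge("writes", "paper")
e3 = edge("paper", "venue")
cn = {e1, e2, e3}
patterns = [
    (frozenset({e1, e2}), 5.0),
    (frozenset({e1}), 4.0),
    (frozenset({e2, e3}), 4.0),
]

print(greedy_score(cn, patterns))  # 5.0
print(exact_score(cn, patterns))   # 8.0
```

In this example the greedy choice of the highest-support pattern blocks the two patterns that together score higher, which shows why a greedy method is a heuristic for this kind of packing problem while an exact method costs more.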
Keywords/Search Tags: Keyword Search, Schema Graph, Candidate Network, Query Log