Font Size: a A A

Semantics Based Top-k Keyword Search Technology In Relational Databases

Posted on:2009-10-27Degree:MasterType:Thesis
Country:ChinaCandidate:H H ZhouFull Text:PDF
GTID:2178360308978054Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
With the amount of available text data in relational databases growing rapidly, the need for ordinary users to search such information is dramatically increasing. Even though the major RDBMSs have provided full-text search capabilities, they still require users to have knowledge of the database schemas and use a structured query language to search information. This search model demands strict requirements for ordinary users.Inspired by the huge success of information retrieval (IR) style keyword search on the web, keyword search in relational databases has become a brand new research topic. Keyword search in relational databases have three challenges:(1) Answers needed by users are results assembled from joining tuples from multiple tables in the form of tuple trees. (2) More aspects should be taken to estimate query results' relevance to a given query. (3) Relational databases have much richer structures than text databases. Existing IR strategies are inadequate in ranking relational outputs.In this thesis, we first study tuples' semantic characteristics and their semantic relations and after that a new ranking function is proposed. Our ranking function encompasses current ranking principles, in addition, it proposes new metrics to measure query relevance.Four algorithms-Naive Algorithm, EBA (Early-stopping Block Algorithm), AEBA (Advanced Early-stopping Block Algorithm) and Global AEBA (Global Advanced Early-stopping Block Algorithm)-tailored to support our ranking function are proposed. The latter three algorithms process a chunk of data at one time and ensure minimal database probes, and can perform effectively as a result. AEBA is an improvement of EBA, which uses multi-table joining policy to further improve system's performance. EBA and AEBA deals with single Candidate Network, while Global AEBA uses a priority strategy to efficiently process multi-Candidate Network at a time. In addition, considering that current algorithms can not directly support our new ranking function, we modify the Sparse Algorithm so that it can be used to incorporate the new ranking function proposed in this thesis.Finally, experimental results are given to prove the adequacy of our ranking function and efficiency and effectiveness of EBA, AEBA and GAEBA.
Keywords/Search Tags:Top-k, key word search, relational databases, information retrieval, tuple tree
PDF Full Text Request
Related items