Semantics Based Top-k Keyword Search Technology In Relational Databases

Posted on:2009-10-27

Degree:Master

Type:Thesis

Country:China

Candidate:H H Zhou

Full Text:PDF

GTID:2178360308978054

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

With the amount of available text data in relational databases growing rapidly, the need for ordinary users to search such information is dramatically increasing. Even though the major RDBMSs have provided full-text search capabilities, they still require users to have knowledge of the database schemas and use a structured query language to search information. This search model demands strict requirements for ordinary users.Inspired by the huge success of information retrieval (IR) style keyword search on the web, keyword search in relational databases has become a brand new research topic. Keyword search in relational databases have three challenges:(1) Answers needed by users are results assembled from joining tuples from multiple tables in the form of tuple trees. (2) More aspects should be taken to estimate query results' relevance to a given query. (3) Relational databases have much richer structures than text databases. Existing IR strategies are inadequate in ranking relational outputs.In this thesis, we first study tuples' semantic characteristics and their semantic relations and after that a new ranking function is proposed. Our ranking function encompasses current ranking principles, in addition, it proposes new metrics to measure query relevance.Four algorithms-Naive Algorithm, EBA (Early-stopping Block Algorithm), AEBA (Advanced Early-stopping Block Algorithm) and Global AEBA (Global Advanced Early-stopping Block Algorithm)-tailored to support our ranking function are proposed. The latter three algorithms process a chunk of data at one time and ensure minimal database probes, and can perform effectively as a result. AEBA is an improvement of EBA, which uses multi-table joining policy to further improve system's performance. EBA and AEBA deals with single Candidate Network, while Global AEBA uses a priority strategy to efficiently process multi-Candidate Network at a time. In addition, considering that current algorithms can not directly support our new ranking function, we modify the Sparse Algorithm so that it can be used to incorporate the new ranking function proposed in this thesis.Finally, experimental results are given to prove the adequacy of our ranking function and efficiency and effectiveness of EBA, AEBA and GAEBA.

Keywords/Search Tags:

Top-k, key word search, relational databases, information retrieval, tuple tree

PDF Full Text Request

Related items

1	Research On Tuple-Based Keyword Search In Relational Databases
2	Top-N Keyword Search Based On Knowledge Base In Relational Databases
3	Keyword Search in Relational Databases
4	Research On A Keyword Search System For Relational Databases
5	Research On Technologies Of Keyword Search Over Relational Databases
6	Research On Multi-Relational Decision Tree Algorithm
7	Multi-relational Classification Based On Decision Tree Algorithm
8	Areas Of The Theme-based Web Information Retrieval Techniques
9	Research On Fuzzy Keywords Search Over Relational Databases And Its Optimization
10	Research On Object-Level Retrieval Algorithm Over Relational Databases