Research On Keyword Search In Structured And Semi-structured Data

Posted on: 2008-05-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J J Xu
GTID: 1118360215484472
Subject: Computer software and theory
Abstract/Summary:
Keyword search is now the most popular method of information discovery because the user does not need to learn a query language or know the underlying structure of the data; entering a few keywords is enough to express an information need. Over the past decades, keyword search in unstructured data has been well studied. As the amount of structured data (typically relational data) and semi-structured data (typically XML data) has grown, keyword search over these two kinds of data has recently attracted much attention. Building on existing work, this dissertation focuses on the effectiveness and efficiency of keyword search in structured and semi-structured data. The main contributions of this dissertation are summarized as follows:
1. The prevailing method of keyword search in relational data is based on search-time join. This dissertation studies its core problem: the search algorithm for join expressions on the schema graph. It proposes a new search algorithm with polynomial delay, and gives a proof of its correctness and an analysis of its time complexity.
2. This dissertation proposes a new method of keyword search in relational data based on pre-join. Starting from an analysis of the problems caused by the physical scattering of the different parts of a piece of information, it defines the Complete Tuple Graph (CTG) and treats the CTG as the granularity of indexing and searching. Based on the CTG, it designs an efficient search algorithm and also puts forward a method of index maintenance.
3. This dissertation proposes a new method of keyword search in XML data based on the Minimal Information Unit (MIU). Starting from an analysis of the problems caused by the refinement of result granularity, it defines the MIU and presents an algorithm for partitioning an XML document into MIUs. Treating the MIU as the granularity of indexing and searching, it designs efficient index structures and the corresponding search algorithms.
For each of the above contributions, this dissertation reports the corresponding experimental results. The results demonstrate the benefits of the proposed methods over previously proposed methods in terms of effectiveness and efficiency. These new methods are promising for further research and can also be applied in practical business applications.
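To make the idea of searching for join expressions on a schema graph more concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the dissertation. It assumes a toy schema graph (the relation names `Author`, `Writes`, `Paper`, and `Cites` are hypothetical) and handles only two keywords: for every pair of relations that match the two keywords, it uses a plain breadth-first search to find the shortest chain of foreign-key joins connecting them. The function names `shortest_join_path` and `candidate_join_paths` are likewise invented for this illustration.

```python
from collections import deque

# Hypothetical schema graph: nodes are relations, edges are foreign-key links.
SCHEMA_GRAPH = {
    "Author": ["Writes"],
    "Writes": ["Author", "Paper"],
    "Paper": ["Writes", "Cites"],
    "Cites": ["Paper"],
}


def shortest_join_path(graph, source, target):
    """Return the shortest chain of relations linking source to target, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # the two relations are not connected in the schema graph


def candidate_join_paths(graph, relations_a, relations_b):
    """List join paths between relations matching keyword A and keyword B.

    Paths are returned shortest first, on the assumption (made here only for
    illustration) that fewer joins means a more tightly related result.
    """
    paths = []
    for a in sorted(relations_a):
        for b in sorted(relations_b):
            path = shortest_join_path(graph, a, b)
            if path is not None:
                paths.append(path)
    return sorted(paths, key=len)


if __name__ == "__main__":
    # Suppose one keyword matches tuples in Author and the other in Paper or Cites.
    for path in candidate_join_paths(SCHEMA_GRAPH, {"Author"}, {"Paper", "Cites"}):
        print(" JOIN ".join(path))
```

Each returned path would then be translated into a join query and evaluated at search time. A complete system must also handle more than two keywords, where the join expressions are trees rather than simple paths, which this sketch deliberately leaves out.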
Keywords/Search Tags: keyword search, structured data, semi-structured data, XML