Context-Based XML LCA Keyword Query Technology

Posted on:2012-04-15

Degree:Master

Type:Thesis

Country:China

Candidate:J H Zhu

Full Text:PDF

GTID:2208330434972943

Subject:Computer software and theory

Abstract/Summary:

As XML (eXtensible Markup Language) has become the leading standard for information representation and exchange on the Web, demand for retrieving XML data is growing. Because XML has many new features, retrieving it effectively and efficiently raises new challenges and opportunities.

Traditionally, structured query languages such as XQuery and XPath can convey complex semantics and therefore return the desired results more precisely. In many cases, however, keyword-style search is more readily accepted, because structured query languages require the user to know the XML document schema and to express the query in a complicated grammar. Most existing XML keyword search approaches are based on some variant of the lowest common ancestor (LCA) concept. For each submitted query, they retrieve only the nodes in the subtree rooted at the LCA node and treat all other nodes as irrelevant. In practice, however, the structure of the XML tree is hidden from the user, and the keyword query is too short to carry enough information for judging relevance. As a result, the subtree rooted at the LCA node usually does not contain all the relevant information, which leaves users dissatisfied with the query results. Improving the low effectiveness from which many XML keyword search engines suffer is the motivation of this thesis.

The main contributions of this thesis are as follows. First, it summarizes existing work, addresses the problem that only information in the subtree rooted at the LCA node is retrieved, and proposes the concept of a context-based LCA node. Second, it proposes a result-expansion approach to define and obtain context information; the problems to be solved are how to decide whether a result should be expanded and which information should be added. Third, to decide whether a result should be expanded, it proposes a decision rule, derived from query log analysis, that balances effectiveness and efficiency. Fourth, it proposes an XML TF*IDF approach to score candidate attributes, where candidate attributes are those attributes not included in the LCA result, and expands the query expression based on the context information.

In experiments against the SLCA approach, our approach performs much better in precision, recall, and F-measure, and the system response time is acceptable. The experimental data verify that the goal has been achieved.
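
The LCA and SLCA notions that the abstract builds on can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the sample document, the Dewey-label encoding, the matching rule (keyword appears in an element's text or equals its tag), and all function names are assumptions made for illustration only.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """
<bib>
  <book>
    <title>XML Search</title>
    <author>Zhu</author>
    <year>2012</year>
  </book>
  <book>
    <title>Databases</title>
    <author>Li</author>
    <year>2010</year>
  </book>
</bib>
"""


def dewey_labels(root):
    """Map each Dewey label (tuple of child positions from the root) to its element."""
    labels = {}
    stack = [(root, ())]
    while stack:
        node, label = stack.pop()
        labels[label] = node
        for i, child in enumerate(node):
            stack.append((child, label + (i,)))
    return labels


def matches(labels, keyword):
    """Labels of elements whose text contains the keyword or whose tag equals it."""
    kw = keyword.lower()
    return [lab for lab, node in labels.items()
            if kw in (node.text or "").lower() or kw == node.tag.lower()]


def common_prefix(a, b):
    """Longest common prefix of two Dewey labels, i.e. the label of their LCA."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def lca_nodes(labels, keywords):
    """All LCAs obtained by picking one matching element per keyword."""
    match_lists = [matches(labels, k) for k in keywords]
    if not all(match_lists):
        return set()  # some keyword has no match, so there is no result
    result = set()
    for combo in itertools.product(*match_lists):
        lca = combo[0]
        for lab in combo[1:]:
            lca = common_prefix(lca, lab)
        result.add(lca)
    return result


def slca_nodes(lcas):
    """Keep only the LCAs that have no other LCA as a descendant (smallest LCAs)."""
    return {a for a in lcas
            if not any(b != a and b[:len(a)] == a for b in lcas)}


labels = dewey_labels(ET.fromstring(DOC))
lcas = lca_nodes(labels, ["title", "Zhu"])
slcas = slca_nodes(lcas)
print(sorted(lcas), [labels[s].tag for s in slcas])
```

In this sketch only the subtree under each SLCA label would be returned to the user. Anything outside that subtree, such as sibling or ancestor information, is never considered, which is the limitation that the context-based expansion described above is meant to address.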

Keywords/Search Tags:

XML, Context, Keywords Search, LCA

Related items

1	Context - Based XML LCA Keyword Query Technology
2	Key Technology Research And Implementation Of Vertical Search Engine
3	Research On Key Technology Of Internet Search Keywords Classification
4	The Research Of Search Keywords Suggestion Based On Web Mining
5	Research On Tourism Forecast Of Guangzhou Based On Search Indexes Of The Keywords
6	Study On Keywords-Based Approximate Search Techniques On Relational Databases
7	Research Of Removing Duplicated Webpages Algorithm Of Search Engine Based On Keywords
8	Narrowing down the semantic gap between content and context using multimodal keywords
9	A Study Of Large Scale Search Log Mining Based Context-Aware Search
10	Research On Outsourcing Data Security Retrieval Technology Based On Keywords