Research On Semantic-based Approximate Query In XML Documents

Posted on:2011-02-08

Degree:Master

Type:Thesis

Country:China

Candidate:D L Yan

Full Text:PDF

GTID:2248330395957905

Subject:Computer application technology

Abstract/Summary:

XML (Extensible Markup Language) is increasingly used in Web applications and has become the standard for data interchange over the Internet. During query processing, users' intents are often ambiguous and incomplete, so users cannot express their purposes accurately. In addition, XML data usually contain semantic information, including relationships among domain concepts and similarity information, and this information can play an important role in improving approximate queries over XML documents. Traditional approaches, however, require a fully specified query and return all answers that satisfy its exact constraints. These answers are often unsatisfactory because they do not reflect what the users intended by their approximate semantic constraints. It is therefore important to discover the semantic knowledge and approximate relations in XML data in order to help users obtain the most relevant answers.

This thesis proposes an approach to approximate querying of XML data with the assistance of semantic information. It proposes algorithms to extract semantic information from XML documents. Following the order of importance of the query conditions, it rewrites the initial query based on the semantic information and then obtains all the approximate answers. The whole process is divided into three parts.

Firstly, effective algorithms are developed to extract semantic information, organized as ontologies and semantic trees, from an XML document. Ontologies provide a concise and unambiguous description of the concepts of a domain and their relationships, while semantic trees are used to compute the similarity of text-type property values.

Secondly, an algorithm to compute IDF scores of query conditions is introduced. The importance of each query condition is calculated from its IDF score. Based on the ontologies and semantic trees extracted from the XML document, the approach rewrites the initial query conditions. Specifically, according to the importance of each query condition, it proposes a set of query expansion rules based on the semantic information, which expand the users' query conditions to semantically equivalent or semantically approximate ones.

The thesis then proposes an algorithm that deletes invalid elements from the initial query and adjusts the query when its structural relationships are wrong for heterogeneous XML documents. An algorithm to relax structural constraints is also presented.

Finally, we evaluate our approach against existing work. The experimental results show that our approach is more effective: compared with existing methods, it achieves a remarkable increase in both the recall and the precision of the returned answers.
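The IDF-based ranking of query conditions mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formula, so everything here is an assumption made for illustration: the smoothed IDF variant log((1 + N) / (1 + n)), the sample document, and the hypothetical helper names idf_scores and rank_conditions. A condition that few records satisfy receives a higher score and is treated as more important.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the thesis.
SAMPLE = """
<books>
  <book><author>Smith</author><topic>XML</topic></book>
  <book><author>Jones</author><topic>XML</topic></book>
  <book><author>Lee</author><topic>XML</topic></book>
  <book><author>Smith</author><topic>RDF</topic></book>
</books>
"""


def idf_scores(xml_text, record_tag, conditions):
    """Return {(field, value): idf} for each (field, value) query condition.

    Assumed formula: idf = log((1 + N) / (1 + n)), where N is the number of
    records and n is the number of records satisfying the condition.
    """
    root = ET.fromstring(xml_text.strip())
    records = list(root.iter(record_tag))
    total = len(records)
    scores = {}
    for field, value in conditions:
        # Count records with at least one <field> element whose text equals value.
        matches = sum(
            1
            for rec in records
            if any((el.text or "").strip() == value for el in rec.iter(field))
        )
        scores[(field, value)] = math.log((1 + total) / (1 + matches))
    return scores


def rank_conditions(scores):
    """Order conditions from most to least important (highest IDF first)."""
    return sorted(scores, key=scores.get, reverse=True)


scores = idf_scores(SAMPLE, "book", [("topic", "XML"), ("author", "Lee")])
ranking = rank_conditions(scores)
```

In this sample, the condition author = Lee matches one record out of four and topic = XML matches three, so the author condition ranks first. A rewriting step in the spirit of the abstract could then relax the low-ranked conditions before the high-ranked ones.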

Keywords/Search Tags:

XML, semantic information, approximate query

Related items

1	Research On Ontology-based Approximate Query In XML Documents
2	Approximate Query Method Based On Relational Database Keyword Semantic Research
3	Research And Application Of Semantic Approximate Top-k Query Over RDF Knowledge Graph
4	Research Of Web Database Approximate Query Based On Semantic Similarity Computing
5	Study On Keywords-Based Approximate Search Techniques On Relational Databases
6	Research On (ε, δ)-Approximate Query Processing Algorithms In Sensor Networks
7	Research On Ontology-based Semantic Query Techniques
8	Research Of Approximate Query Processing Technology For Large Scale Data
9	Research On Semantic Search And Related Technology
10	Semantic-based Query In Heterogeneous Information Integration Environment