Studies On Technologies Of Flexible Query For XML

Posted on: 2013-08-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Yan
Full Text: PDF
GTID: 1228330467482768
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the World Wide Web, semi-structured data has gained growing attention, and XML has become the de facto standard for exchanging information and integrating heterogeneous data sources over the Web. XML differs from other document formats in that it carries structure information in addition to content information. When searching XML documents, users often have insufficient knowledge of that structure and content, so they frequently obtain empty answers or have to reformulate the query expression several times. To avoid empty query results, query relaxation methods have been proposed. The basic idea of query relaxation is to loosen the constraints of the original query so that its scope expands, and then to return the most relevant results to the user. After the original query is relaxed, users face another problem: many answers are usually returned. To resolve this many-answers problem, an efficient method for ranking the query results is proposed. Moreover, users often have fuzzy or imprecise requests when querying XML documents, and may wish to issue fuzzy queries that consist of fuzzy terms or fuzzy relations. How to extend system functions so that systems satisfy users' needs more closely is therefore an important issue. In short, adding flexibility to the XML query language can help users improve their interaction with the system.

In recent years, much research has been devoted to flexible query technologies for XML databases, mainly XML query relaxation, query result ranking, and fuzzy query. However, most of these approaches do not consider the user's preferences when relaxing the original query.
In real applications, the efficiency of query relaxation is greatly affected by the user's preferences. To deal with the problems of personalized query and fuzzy query that arise when querying XML databases, this dissertation proposes efficient flexible query technologies that satisfy users' query needs and preferences. The main contributions include:

(i) To deal with the problem of personalized query, a method for relaxing contextual preferences is proposed, that is, preferences whose query results depend on the context at the time the query is submitted. Context is modeled as a set of multidimensional attributes. First, an XML contextual preference model is proposed. Next, context relaxation operators, which are produced by relaxing one or more context attributes of a preference, are discussed. Contextual preferences are then stored in a data structure called the profile tree. Finally, the preference degrees of contextual preferences are obtained automatically by association rule mining over the profile tree.

(ii) To resolve the problem of empty or many answers returned from XML databases, a query result ranking method based on XML structural preference relaxation and contextual preference scoring is proposed. First, structural preference is defined, and all possible relaxed queries are determined by the structural preference. Next, users express their interests on XML attribute nodes and assign interest scores to the nodes they care about, so that the best answers can be provided quickly. Finally, a preference query result ranking method is proposed. It includes a cluster merging algorithm that merges clusters based on the similarity of their context states, an order finding algorithm that finds representative orders of the clusters, and a Top-k ranking algorithm that deals with the many-answers problem.

(iii) Users often have fuzzy or imprecise requests when querying XML documents. Based on XML structure and content, a method for reflecting users' fuzzy query intention is proposed. First, based on fuzzy set theory, a fuzzy extension of XPath query expressions, which can be expressed using fuzzy predicates, is proposed. Next, based on algebraic operations, a novel method for expressing the user's fuzzy query intention is proposed. Its goal is to define a set of fuzzy algebraic operations that support fuzzy query in XML, so that the fuzzy query results fully respect the fuzzy query conditions. Furthermore, a novel ranking method, which considers the relevance between the membership degree and user-defined weights, is proposed. Finally, an efficient method for computing the Top-k answers of the fuzzy query results is proposed.

(iv) To deal with the problem of empty answers returned from XML databases in response to users' fuzzy queries, a fuzzy query relaxation method, which obtains more query results to satisfy users' query requests, is proposed. First, the original fuzzy query condition is translated into a precise query interval whose values may satisfy the user's fuzzy query intention. Next, based on an extended vector space model, a method for measuring the relevance between an XML attribute node and a fuzzy predicate is proposed. Finally, based on an improved PIR approach, a ranking method, which considers the relevance between the nodes specified by the fuzzy query and the nodes not specified by it, is proposed.
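To make the fuzzy query ideas in contributions (iii) and (iv) more concrete, the following is a minimal sketch, not the dissertation's actual algorithms. It assumes trapezoidal membership functions for the fuzzy predicates, a weighted sum to combine membership degrees with user-defined weights, and an alpha-cut to turn a fuzzy condition into a precise interval. The sample document, the predicate names `cheap` and `recent`, and all function names are hypothetical.

```python
import heapq
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names are illustrative only.
XML_DOC = """
<books>
  <book><title>A</title><price>15</price><year>2012</year></book>
  <book><title>B</title><price>30</price><year>2005</year></book>
  <book><title>C</title><price>50</price><year>2011</year></book>
</books>
"""


def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def alpha_cut(a, b, c, d, alpha):
    """Precise interval of values whose membership degree is at least alpha."""
    return (a + alpha * (b - a), d - alpha * (d - c))


# Assumed fuzzy predicates: name -> (child element, trapezoid parameters).
PREDICATES = {
    "cheap": ("price", (0, 0, 20, 40)),
    "recent": ("year", (2000, 2010, 2100, 2100)),
}


def score(node, weights):
    """Weighted sum of membership degrees (an assumed combination rule)."""
    total = 0.0
    for name, weight in weights.items():
        tag, params = PREDICATES[name]
        value = float(node.findtext(tag))
        total += weight * trapezoid(value, *params)
    return total


def top_k(xml_text, weights, k):
    """Return the k highest-scoring (title, score) pairs, best first."""
    root = ET.fromstring(xml_text)
    scored = (
        (score(book, weights), book.findtext("title"))
        for book in root.iter("book")
    )
    return [(title, s) for s, title in heapq.nlargest(k, scored)]


# Rank books that are "cheap" (weight 0.7) and "recent" (weight 0.3).
result = top_k(XML_DOC, {"cheap": 0.7, "recent": 0.3}, k=2)
```

In this sketch, `alpha_cut(0, 0, 20, 40, 0.5)` maps the fuzzy condition "cheap" to the precise price interval from 0 to 30, which could then be issued as an ordinary range predicate when the fuzzy query returns no answers. The weighted sum is only one possible way to relate membership degrees to user weights; the abstract does not specify the formula used.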
Keywords/Search Tags: XML, structure and content, contextual preference, query relaxation, algebraic operations, fuzzy query, query results ranking