
The Research Of Top-K Proximity Search Method For XML Data Based On Keyword

Posted on: 2014-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2248330395987044
Subject: Computer application technology
Abstract/Summary:
With the rapid development of network technology, XML is increasingly becoming the standard for data representation and data exchange on the Web. It is widely used in many areas, such as e-commerce and finance. As the number of XML documents on the network grows exponentially, the retrieval of XML data has attracted the attention of many researchers. Keyword query has received particular attention because it is easy for users to formulate.
This thesis first analyzes the strengths and weaknesses of existing keyword query methods and finds that most of them use "and" semantics, returning only results that contain all of the keywords. For users, however, results that contain only some of the query keywords can also be meaningful. Because it takes the structural characteristics of the XML document into account, XML structure query can capture the user's query intention more accurately and return results with high precision. Combining techniques from XML structure query with keyword query methods is therefore of high practical significance.
This thesis introduces object-oriented thinking and query relaxation into keyword query, and proposes a method for judging the user's query need from the keywords the user poses and the structure information of the XML document. The method first considers the structural characteristics of the XML document, introduces the concept of objectification, and divides the XML document into different objects that are associated with one another. It then uses the number of keywords posed and the positions of those keywords in the XML document to return the result-object set, and obtains the similar result-object set, with similarity threshold less than U, through a similar-object determination method. Finally, it builds structure queries over the similar result-object set and executes the corresponding twig queries.
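The difference between "and" semantics and returning objects that contain only part of the keywords can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes, purely for illustration, that each child of the root element is one object, and the names `object_words`, `match_objects`, and `min_hits` are hypothetical.

```python
import xml.etree.ElementTree as ET


def object_words(elem):
    """Collect the lower-cased words from all text inside one object subtree."""
    return set(" ".join(elem.itertext()).lower().split())


def match_objects(xml_string, keywords, min_hits=1):
    """Return (object_index, matched_keywords) for every object that
    contains at least `min_hits` of the query keywords.

    Assumption: each direct child of the root is treated as one object.
    """
    root = ET.fromstring(xml_string)
    wanted = {k.lower() for k in keywords}
    results = []
    for index, obj in enumerate(root):
        hits = wanted & object_words(obj)
        if len(hits) >= min_hits:
            results.append((index, sorted(hits)))
    return results


DOC = """<library>
  <book><title>XML Search</title><author>Li</author></book>
  <book><title>Graph Data</title><author>Wang</author></book>
  <book><title>Keyword Search</title><author>Li</author></book>
</library>"""

# Relaxed matching: objects with at least one keyword are returned.
partial = match_objects(DOC, ["xml", "li"])
# "and" semantics: every keyword must appear in the object.
full = match_objects(DOC, ["xml", "li"], min_hits=2)
```

With `min_hits=1` the third book is returned even though it matches only one keyword; setting `min_hits` to the number of keywords reproduces the stricter "and" behaviour.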
Secondly, in the result-scoring stage, we combine content-query scoring with structure-content query scoring to score the results. First, we obtain the score of the structure query, taking into account the structural features of the query relaxation. Then we take attribute elements as the processing unit, take the user's attribute preferences into consideration, and obtain the relevance score of each element in the content query. Finally, we obtain the final score of each result by combining these two scores.
On this basis, this thesis proposes an XML keyword search framework based on semantic and structure relaxation. The method returns not only fully matching results, but also partially matching and approximately matching results. Through an effective ranking method, it obtains the Top-K query results most relevant to users' needs. Finally, we experimentally compare the proposed method with the existing XML keyword search methods XRank and MSLCA. The experimental results show that the proposed method captures the users' query intention better and substantially improves query effectiveness and efficiency.
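The combination of the two scores followed by Top-K selection can be sketched as follows. The abstract does not state how the scores are combined, so the weighted sum and the weight `alpha` below are assumptions made only for illustration; `combined_score` and `top_k` are hypothetical names.

```python
import heapq


def combined_score(structure_score, content_score, alpha=0.5):
    """Assumed combination: a weighted sum of the two partial scores."""
    return alpha * structure_score + (1 - alpha) * content_score


def top_k(candidates, k, alpha=0.5):
    """Return the names of the k highest-scoring candidates.

    `candidates` is a list of (name, structure_score, content_score).
    """
    scored = [(combined_score(s, c, alpha), name) for name, s, c in candidates]
    return [name for _, name in heapq.nlargest(k, scored)]


# Illustrative values only, not experimental data from the thesis.
candidates = [
    ("full match", 1.0, 0.8),
    ("partial match", 1.0, 0.4),
    ("approximate match", 0.6, 0.9),
]
best_two = top_k(candidates, 2)
```

Here a structurally relaxed result with strong content relevance can outrank a structurally exact result with weak content relevance, which is the kind of trade-off a combined score is meant to express.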
Keywords/Search Tags:XML, keyword, object, relaxation query, user preferences