Research On Key Word Query In XML Document

Posted on: 2009-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: L L Kong
Full Text: PDF
GTID: 2178360242489731
Subject: Computer application technology
Abstract/Summary:
XML is a self-describing, extensible language that specifies both the content and the structure of a document. The number of XML documents in Web pages, commercial text repositories, digital libraries and similar sources has grown exponentially, so efficient information retrieval from these large collections is becoming extremely important. XML keyword querying is a research hotspot in the field of XML data search. A keyword query over XML data takes the element as its unit of granularity and returns only the document fragments that contain the keywords, which improves search speed. Compared with XML query languages such as XQuery, the main advantage of keyword querying is that users do not need to learn a complicated query language or understand the structure of the underlying XML documents in depth; they only need to enter keywords related to the content they are interested in.
The main contents of this thesis are as follows. The current state of research on XML keyword querying is analyzed, and the notion of keyword categorization proposed in XSeek is adopted, which inspired a new keyword querying method based on keyword categorization. This method divides keywords into predicate keywords and result keywords. Predicate keywords restrict the scope of the query and do not appear in the result set; only result keywords appear in the result set. This idea contributes substantially to reducing the size of the result set. A simple query grammar is defined and a new query processing flow is proposed, in which the categorized keywords play different roles during querying, so that the query behaves like a structured query (a "structure-like query"). As a result, query accuracy is effectively improved.
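The split between predicate keywords and result keywords can be illustrated with a minimal Python sketch. It is not the thesis's query grammar or processing flow, which the abstract does not spell out. The sample document, the function names `contains_keyword` and `categorized_query`, and the `entity_tag` parameter (the element taken as the unit of granularity) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the thesis.
DOC = (
    "<library>"
    "<book><title>XML Basics</title><author>Kong</author><year>2008</year></book>"
    "<book><title>Databases</title><author>Li</author><year>2007</year></book>"
    "</library>"
)


def contains_keyword(elem, keyword):
    """True if keyword matches a tag name or occurs in a text value under elem."""
    kw = keyword.lower()
    for node in elem.iter():
        if node.tag.lower() == kw:
            return True
        if node.text and kw in node.text.lower():
            return True
    return False


def categorized_query(root, entity_tag, predicates, results):
    """Return (tag, text) pairs for result keywords, taken only from
    entities that satisfy every predicate keyword.

    Predicate keywords only filter entities; they are never returned
    themselves. This is an assumed reading of the abstract.
    """
    wanted = {r.lower() for r in results}
    hits = []
    for entity in root.iter(entity_tag):
        if all(contains_keyword(entity, p) for p in predicates):
            for node in entity.iter():
                if node.tag.lower() in wanted:
                    hits.append((node.tag, (node.text or "").strip()))
    return hits


root = ET.fromstring(DOC)
# "Kong" restricts the query; only "title" nodes are returned.
answer = categorized_query(root, "book", predicates=["Kong"], results=["title"])
print(answer)
```

In this sketch the predicate keyword narrows the candidate elements and only the result keyword contributes nodes to the output, which is how the smaller result set described above would arise.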
The notion of a Similar Node Pair (SNP) is defined, and a Similar Node Pairs Finding Algorithm for finding SNPs, together with a criterion for judging whether an SNP is meaningful, is proposed; all of these efforts aim at matching the right keywords. A name node index, a value node index and a main Dewey index are constructed to speed up lookup in both directions between a node and its Dewey number. The experimental results demonstrate that the new query method based on keyword categorization expresses the user's query intention better than traditional keyword querying.
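As a rough illustration of the three indexes, the following Python sketch assigns Dewey numbers to the nodes of a small document and builds a name index, a value index and a Dewey-to-node index. The index layouts, the function names `build_indexes` and `common_ancestor`, and the sample document are assumptions; the abstract does not give the actual structures, and the SNP algorithm is not reproduced here because its definition is not stated.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, not taken from the thesis.
DOC = (
    "<library>"
    "<book><title>XML Basics</title><author>Kong</author></book>"
    "<book><title>Databases</title><author>Li</author></book>"
    "</library>"
)


def build_indexes(root):
    """Label every node with a Dewey number and build three lookup tables."""
    name_index = defaultdict(list)   # tag name   -> Dewey labels
    value_index = defaultdict(list)  # text token -> Dewey labels
    dewey_index = {}                 # Dewey label -> node

    def visit(node, dewey):
        label = ".".join(str(part) for part in dewey)
        dewey_index[label] = node
        name_index[node.tag].append(label)
        for word in (node.text or "").split():
            value_index[word.lower()].append(label)
        # The i-th child of a node labelled d gets the label d.i
        for pos, child in enumerate(node, start=1):
            visit(child, dewey + (pos,))

    visit(root, (1,))
    return name_index, value_index, dewey_index


def common_ancestor(a, b):
    """Longest common prefix of two Dewey labels, i.e. their lowest common ancestor."""
    prefix = []
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        prefix.append(x)
    return ".".join(prefix)


root = ET.fromstring(DOC)
name_index, value_index, dewey_index = build_indexes(root)
print(name_index["title"])
print(value_index["kong"])
print(dewey_index[common_ancestor("1.1.1", "1.1.2")].tag)
```

Because a Dewey label encodes the path from the root, the ancestor relationship between two matched nodes can be read off by comparing label prefixes, without walking the tree.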
Keywords/Search Tags:Categorization, Semantic Capability, Keyword Querying, XML, Segment of XML Document