
The Measurement Of The Structural Similarities Of XML Document Graphs

Posted on: 2009-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
GTID: 2178360245954717
Subject: Circuits and Systems
Abstract/Summary:
XML (Extensible Markup Language) is a markup language put forward by the W3C after HTML. More and more information exchange depends on XML, and many software systems need to search, retrieve and process related XML documents and return approximate matches. The estimation of XML similarity has therefore received increasing attention. The structure of an XML document is not simply a tree. Because XML supports all the typical hypertext links, and as its descriptive capability has expanded, XML documents can no longer be described as trees and must be described as graphs. The available methods for comparing XML tree structural similarity and graph structural similarity ignore the structural characteristics of the document, so their results differ greatly from reality. This thesis proposes a method for measuring the structural similarity of XML document graphs to solve this problem. It adds the links to the document structure and uses the resulting document graph to compare structural similarity.

The thesis addresses the following aspects. First, it introduces the concept of the XML document, its characteristics and its application domains, and it analyses the graph description of XML documents, its objective necessity and its wide application. Second, after presenting the existing methods, it puts forward a new method. Starting from the graph structure of the document, the method converts graphs to trees so that weights can be assigned to nodes and edges, derives the cost of each change from these weights, and then describes the similarity of the document graphs. Third, it analyses the feasibility of the method by working through an example:

1. Differences in the associations between nodes are reflected in the degree of similarity.
2. The root node is found from the adjacency matrix, which determines the position and number of root nodes.
3. Starting from the root node, the tree structure is established and the edges and nodes are given corresponding weights, which provides the basis for calculating the operation cost.
4. The insertion and deletion of nodes and edges are carried out by matrix transformation, and their costs are calculated.
5. The structural similarity is calculated using the formula proposed in this thesis.

Finally, the result of the example shows that the method agrees with human judgement better than the element comparison, edge comparison and edit distance methods, and that it reflects the actual similarity of XML document graphs. The thesis also gives a plan for improving the algorithm.
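
Steps 2 and 3 of the example can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the weighting scheme or the similarity formula, so the function names (`find_roots`, `spanning_tree`, `node_weights`) and the rule that a node's weight halves at each level are assumptions made only for illustration. A root is taken to be a node with no incoming edge, that is, an all-zero column of the adjacency matrix.

```python
from collections import deque


def find_roots(adj):
    """Return indices of nodes with no incoming edge (all-zero column)."""
    n = len(adj)
    return [j for j in range(n) if all(adj[i][j] == 0 for i in range(n))]


def spanning_tree(adj, root):
    """Breadth-first traversal from root.

    Returns ({node: parent}, {node: depth}). Edges that reach an
    already visited node (the link edges that make the document a
    graph rather than a tree) are left out of the tree.
    """
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, edge in enumerate(adj[u]):
            if edge and v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth


def node_weights(depth):
    """Hypothetical weighting: the weight halves at each level below the root."""
    return {v: 1.0 / (2 ** d) for v, d in depth.items()}


# Example graph: tree edges 0->1, 0->2, 1->3 plus one link edge 2->3.
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

roots = find_roots(adj)
parent, depth = spanning_tree(adj, roots[0])
weights = node_weights(depth)
print(roots, parent, weights)
```

In this sketch the link edge 2->3 is dropped when the tree is built. A fuller treatment would keep such edges and charge for them separately when computing edit costs.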
Keywords/Search Tags: XML document graph, structural similarity, eXtensible Link Language (eXLL), adjacency matrix, relational matrix