The Survey On The Semantic Similarity Of XML Documents

Posted on: 2010-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: C L Wang
GTID: 2178360302962060
Subject: Circuits and Systems
Abstract/Summary:
With the development of information technology, XML has become an increasingly active topic, and it brings new hope for Web-based information interchange. However, XML documents are semi-structured, and searching and processing semi-structured data raises many problems, especially when a user needs to find information relevant to a particular piece of data. Approximate search techniques are therefore needed. Their foundation is the ability to compute the similarity and relevance between pieces of information and between documents, so it is important to study the similarity between XML documents.

The similarity between XML documents underlies document search, mining, and clustering, and it is a central topic in information retrieval and data storage. This thesis reviews the state of research on XML document similarity and analyzes its application in data integration, data warehouses, and document clustering.

We first introduce the syntax of XML, which is the foundation for applying and processing XML documents, and then the concepts of the Semantic Web and trees. The focus is a summary of current methods for computing XML document similarity, which can be divided into six kinds: edit-distance based, information-retrieval based, edge matching, set-based measurement, pattern matching, and SIC. For each of these six kinds of methods we show its applications in different areas and its shortcomings. We then look ahead to open research questions in XML document similarity and, finally, draw conclusions.
Keywords/Search Tags: XML documents, semantic, semantic similarity, edit-distance, edge matching
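
To make the idea of a structural similarity score concrete, the following is a minimal sketch and not a method taken from the thesis. It assumes a simple set-style measure: each document is reduced to the set of its root-to-element tag paths, and two documents are compared with the Jaccard coefficient of those sets. The function names `tag_paths` and `path_set_similarity` and the two sample documents are hypothetical, chosen only for illustration. Element text and attributes are ignored, so the score reflects structure only.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-element tag paths in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    # Iterative depth-first walk; each stack entry is (element, its path).
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def path_set_similarity(xml_a, xml_b):
    """Jaccard coefficient of the two documents' tag-path sets (0.0 to 1.0)."""
    a, b = tag_paths(xml_a), tag_paths(xml_b)
    # Both sets contain at least the root path, so the union is never empty.
    return len(a & b) / len(a | b)


# Hypothetical sample documents: doc_b has one extra element, <year>.
doc_a = "<book><title>XML</title><author><name>A</name></author></book>"
doc_b = ("<book><title>Web</title><author><name>B</name></author>"
         "<year>2010</year></book>")

print(path_set_similarity(doc_a, doc_b))  # 4 shared paths out of 5 -> 0.8
```

A measure like this is cheap to compute but blind to element order, repetition, and content. Tree edit distance is the usual choice when those differences matter, at a higher computational cost.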