Research Of Data Mining Techniques For XML Documents

Posted on: 2008-06-13 | Degree: Master | Type: Thesis
Country: China | Candidate: D X Mei | Full Text: PDF
GTID: 2178360215980824 | Subject: Computer application technology
Abstract/Summary:
When data is stored in XML format, a great deal of knowledge and many kinds of patterns can be extracted from it, which has given rise to XML data mining. Research on data mining techniques for XML documents covers mining both the structure and the content of XML.

In association rule mining over a single XML document, analysis of the document shows that the valuable data are usually the data values or data types that occur with high frequency, so the primary task is to find them. Because an XML document can be viewed as a hierarchical tree, the data are the leaf nodes of that tree and must be reached by following a path from the root to a leaf. We can therefore mine the data through the paths of the XML document.

In data mining based on the concept hierarchy tree and XML, the primary goal is to find frequent subtrees and the interesting relationships among their nodes. A frequent interesting path must therefore satisfy two conditions: its frequency must exceed a threshold, and the path must be associated with the mining task. The concept hierarchy tree of the data of interest is used to decide which paths are frequent and interesting. If the data on a path cannot be found in the definitions of the abstract concepts, the path is not generalized and is pruned. Generalization with the concept hierarchy tree has two basic steps. First, each attribute value is replaced by its parent concept in the hierarchy tree. Second, identical subtrees are merged. If the number of XML subtrees remains large, the values are replaced by still more abstract concepts.
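The path-based procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual algorithm: the function names, the sample document, the flat one-level `HIERARCHY` mapping, and the choice to treat "frequent" as a count of at least `min_count` are all assumptions made for illustration. The sketch collects root-to-leaf label paths, keeps only paths relevant to the mining task, prunes values with no parent concept, replaces each remaining value by its parent concept, and merges identical generalized items by counting them.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document; not taken from the thesis.
SAMPLE = """
<orders>
  <order><city>Beijing</city><item>apple</item></order>
  <order><city>Shanghai</city><item>pear</item></order>
  <order><city>Paris</city><item>apple</item></order>
  <order><city>Atlantis</city><item>carrot</item></order>
</orders>
"""

# Hypothetical one-level concept hierarchy: leaf value -> parent concept.
HIERARCHY = {
    "Beijing": "China", "Shanghai": "China", "Paris": "France",
    "apple": "fruit", "pear": "fruit", "carrot": "vegetable",
}


def leaf_paths(xml_text):
    """Yield (label_path, value) for every leaf element of the document."""
    root = ET.fromstring(xml_text)

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:
            # A leaf: its text is the data reached along this label path.
            yield path, (node.text or "").strip()
        for child in children:
            yield from walk(child, path)

    yield from walk(root, ())


def frequent_generalized_paths(xml_text, hierarchy, relevant_tags, min_count):
    """Count generalized (label_path, parent_concept) items and keep frequent ones."""
    counts = Counter()
    for path, value in leaf_paths(xml_text):
        if path[-1] not in relevant_tags:
            continue  # path is not associated with the mining task
        if value not in hierarchy:
            continue  # no parent concept defined: not generalized, pruned
        # Step 1: replace the value by its parent concept.
        # Step 2: identical generalized items merge into one counter entry.
        counts[(path, hierarchy[value])] += 1
    # Assumption: "frequent" means the count reaches min_count.
    return {item: n for item, n in counts.items() if n >= min_count}


result = frequent_generalized_paths(SAMPLE, HIERARCHY, {"city", "item"}, 2)
for (path, concept), n in sorted(result.items()):
    print("/".join(path), concept, n)
```

With the sample data, "Atlantis" is pruned because it has no parent concept, and the single-occurrence items "France" and "vegetable" fall below the threshold. A deeper hierarchy could be handled by applying the mapping repeatedly while too many distinct items remain, which corresponds to the last step in the abstract.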
Keywords/Search Tags: Data Mining, XML, Label Path, Concept Hierarchy Tree