
The Research Of Association Datamining Based On XML And Web Data

Posted on: 2009-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: C J Cao
Full Text: PDF
GTID: 2178360242966434
Subject: Computer application technology
Abstract/Summary:
In recent years, the rapid development of the Internet has generated more and more data on the Web. How to make good use of this new knowledge and improve the utility of the information has become a major challenge. As XML technology has matured, more and more Web data is represented in XML. XML plays an increasingly important role in exchanging and representing a wide variety of data on the Web and elsewhere because of its extensibility, platform independence, flexibility, simplicity, standardization, and expressive power for representing data. As a result, there is growing demand for efficient methods that can extract rules and patterns from XML data. However, XML data on the Web is complex and semi-structured, with no fixed description pattern, so traditional data mining methods designed for relational databases cannot be applied to it directly. Developing efficient and scalable methods for XML data mining is therefore a significant challenge.

Based on the characteristics of XML data and the XML support offered by XQuery and .NET DOM, we studied in depth how to extract meaningful association rules directly from XML data. First, we improved the XQuery-based algorithm to address two limitations: it could not mine complex and irregular XML data, and it could not handle large sets of XML data. Experimental results verified that the improved algorithm can efficiently extract association rules from XML data. Second, we discussed how to mine XML data using an algorithm implemented with .NET DOM. .NET DOM manipulates XML data through an object-oriented mechanism, which is closer to human thinking and easier to understand. Moreover, the algorithm implemented with .NET DOM is better at presenting results visually and runs faster because it is compiled. Third, we compared the two methods in an experiment using XML data extracted from the Web and found that each has advantages and disadvantages in different mining environments.

Finally, we proposed a five-level framework model for mining association rules from XML data and described each functional module of the model in detail. Based on this model, we designed an association rule mining system for an e-commerce website. The system can process different kinds of input data, supports common association rule mining algorithms, and presents mining results visually. It also offers good integration and extensibility.
Keywords/Search Tags: Association rules, XML, XQuery, Apriori algorithm, .NET DOM, Web Data Mining
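
The DOM-style approach described in the abstract (load XML into an object tree, then run an association rule algorithm over it) can be illustrated with a minimal sketch. This is not the thesis's implementation: the thesis used .NET DOM and XQuery, while the sketch below substitutes Python's standard `xml.etree.ElementTree` and a basic Apriori procedure. The element names (`transactions`, `transaction`, `item`), the sample data, and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical sample data; the element names are assumptions, not from the thesis.
XML_DATA = """
<transactions>
  <transaction><item>bread</item><item>milk</item></transaction>
  <transaction><item>bread</item><item>butter</item><item>milk</item></transaction>
  <transaction><item>butter</item><item>milk</item></transaction>
  <transaction><item>bread</item><item>butter</item></transaction>
</transactions>
"""


def load_transactions(xml_text):
    """Parse the XML tree and return one frozenset of item names per transaction."""
    root = ET.fromstring(xml_text.strip())
    return [
        frozenset(i.text.strip() for i in t.iter("item") if i.text)
        for t in root.iter("transaction")
    ]


def frequent_itemsets(transactions, min_support):
    """Basic Apriori: return {itemset: support} for all frequent itemsets."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates of size k that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidates of size k.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, confidence))
    return rules


if __name__ == "__main__":
    txns = load_transactions(XML_DATA)
    freq = frequent_itemsets(txns, min_support=0.5)
    for lhs, rhs, sup, conf in association_rules(freq, min_confidence=0.6):
        print(f"{sorted(lhs)} -> {sorted(rhs)} (support={sup:.2f}, confidence={conf:.2f})")
```

The sketch assumes a flat, regular transaction layout. Handling the complex and irregular XML structures that the abstract identifies as a limitation would need an additional step that maps arbitrary subtrees to transactions and items.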