Research And Application Of Optimization Algorithm Of WEB Data Mining Based On XML

Posted on:2016-01-14

Degree:Master

Type:Thesis

Country:China

Candidate:L D Zhang

Full Text:PDF

GTID:2308330473952383

Subject:Software engineering

Abstract/Summary:

The Internet has become an effective and almost indispensable way for people to obtain information, but extracting the required information from its vast and varied data is like looking for a needle in the ocean. Helping users find valuable information on the Internet has therefore become a meaningful and active research direction. XML has become the standard for data exchange on the mobile Internet, where large numbers of XML documents and XML data management needs are emerging. How to mine useful information from them effectively and in a timely manner has become a focus of attention for the mobile Internet industry.

This paper briefly introduces the theoretical foundations for building a Web XML data storage and query system in the context of data mining, namely XML technology and data mining algorithms. On this basis, it analyzes the classic APRIORI algorithm, summarizes its main disadvantages, and proposes feasible solutions. The first is to reduce the number of database tuples scanned when computing the support of candidate itemsets, which improves the efficiency with which APRIORI generates frequent itemsets. The second is to use a compressed rule set, an association-rule pruning strategy, and an optimized rule-generation method, with the aim of narrowing the range of frequent itemsets that must be examined when generating strong association rules. The third is to speed up data query and storage. Because path expressions form the core of XML queries, the paper presents a method for storing XML documents in a relational database. The method is based on the XPath data model and uses Dietz codes to identify the elements of an XML document. The database stores the Dietz code of each element together with that of its parent element, preserving the parent-child relationships between elements so that relational data can be converted back into XML documents or document fragments. Using this method, a middleware with three modules (storage, conversion, and query) was developed to store the elements, attributes, and text of XML documents.

Finally, the improved APRIORI algorithm is applied to the "XML data storage and query system". The experimental results show that the improved algorithm increases query speed, has a clear advantage in time complexity, improves the quality of the strong association rules, and reduces computation time, making data query and storage more efficient.
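
As an illustration of the frequent-itemset and strong-rule steps described in the abstract, the following is a minimal Python sketch of classic Apriori with one simple form of transaction reduction. It is not the thesis's improved algorithm, whose details are not given here. The function names `apriori` and `strong_rules`, the sample transactions, and the thresholds are all assumptions chosen for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev
                             for s in combinations(c, k - 1))}
        # Transaction reduction (assumed stand-in for "fewer tuples scanned"):
        # a transaction with fewer than k items cannot contain a k-itemset.
        transactions = [t for t in transactions if len(t) >= k]
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

def strong_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is always present in `frequent`.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules

# Hypothetical sample data for illustration only.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"},
          {"a", "b", "c", "d"}]
frequent_sets = apriori(sample, min_support=3)
rules = strong_rules(frequent_sets, min_confidence=0.7)
```

Dropping short transactions before each counting pass leaves the support counts unchanged, because a discarded transaction could not have contained any candidate of the current size.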
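
The storage scheme described in the abstract can be sketched as follows, assuming Dietz codes are taken as (preorder, postorder) rank pairs, under which one element is an ancestor of another exactly when its preorder rank is smaller and its postorder rank is larger. The table layout, the column names, and the functions `shred` and `descendants` are hypothetical and are not taken from the thesis. The sketch stores only elements and their text, not attributes.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred(xml_text, conn):
    """Store each element with its pre/post ranks and its parent's pre rank."""
    conn.execute(
        "CREATE TABLE node (pre INTEGER PRIMARY KEY, post INTEGER, "
        "parent INTEGER, tag TEXT, text TEXT)")
    counter = {"pre": 0, "post": 0}

    def visit(elem, parent_pre):
        counter["pre"] += 1          # preorder rank: assigned on entry
        pre = counter["pre"]
        for child in elem:
            visit(child, pre)
        counter["post"] += 1         # postorder rank: assigned on exit
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (pre, counter["post"], parent_pre, elem.tag,
             (elem.text or "").strip()))

    visit(ET.fromstring(xml_text), None)

def descendants(conn, ancestor_tag, tag):
    """Evaluate a path like //ancestor_tag//tag using only the rank pairs."""
    rows = conn.execute(
        "SELECT d.text FROM node a "
        "JOIN node d ON d.pre > a.pre AND d.post < a.post "
        "WHERE a.tag = ? AND d.tag = ? ORDER BY d.pre",
        (ancestor_tag, tag))
    return [row[0] for row in rows]

# Hypothetical sample document for illustration only.
doc = ("<library><book><title>A</title><author>X</author></book>"
       "<book><title>B</title></book></library>")
conn = sqlite3.connect(":memory:")
shred(doc, conn)
titles = descendants(conn, "library", "title")
```

Keeping the parent's rank in each row makes parent-child steps a simple equality join, while the rank comparison handles ancestor-descendant steps without recursive queries.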

Keywords/Search Tags:

XML, Xpath, APRIORI algorithm, association rules, Dietz code

Related items

1	The Research And Implementation Of Association Rules Algorithms-Apriori Based On Cloud Computing
2	Research And Improvement Of Apriori Algorithm In Association Rules
3	Research And Application Of Apriori Mining Algorithm Based On Association Rules In Colleges Management System
4	Research On Improving Apriori Algorithm For Mining Association Rules
5	Model Design, Based On The Apriori Algorithm And Olap Association Rules Mining
6	Research On The Key Technology Of Information Association Based On Data Mining
7	Research And Improvement Based On Apriori Algorithm And Its Application In Wisdom Endowment
8	Research On Incremental Updating Association Rules Mining Based On Apriori Algorithm
9	Research And Application Of Apriori Algorithm In Association Rules
10	Research On Association Rules Algorithm In Data Mining