Research On Web Text Mining Based On XML And Association Rule Mining Algorithm

Posted on:2012-11-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Wang

GTID:2178330338994854

Subject:Computer application technology

Abstract/Summary:

In recent years, with the development of computer technology and the spread of the Internet, the volume of data held on web servers at every level has grown enormous, and the types of data have become increasingly numerous and diverse. How to use these data more effectively and extract valuable information from them has become a research hotspot in many fields.

Traditional database and data mining technologies have developed rapidly and continue to mature, but because Web data is semi-structured or unstructured, they face many difficulties in mining information from it. XML is a semi-structured data model, and as it has developed, more and more Internet information is represented in XML. XML is extensible, platform independent, and flexible, and it has strong expressive power for data, so its role in representing and exchanging information grows steadily. How to extract valuable information effectively from the huge quantity of XML data is therefore an urgent problem.

The Apriori algorithm is a classical algorithm for mining association rules and has had great influence in that field. However, because it must scan the database repeatedly and consumes a large amount of space, many researchers have improved it in a variety of ways. Existing Apriori implementations written in XQuery still leave room for improvement. For example, in some circumstances the volume of XML data is so large that related data is stored across many documents that have no necessary relationship to one another, yet present association rule mining algorithms mainly mine a single XML document and must be improved before they can mine several.

This thesis combines XQuery, the query language for XML, with association rule mining to implement an XQuery-based Apriori algorithm and to study the mining of association rules across several XML documents. It improves the algorithm by introducing the collection facility of the XQuery language, which can access several XML documents, so that several documents can be mined without reducing mining efficiency. The improved algorithm will be applied in an XML-based Web text mining model, and its feasibility and validity will be verified.
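
The idea of mining frequent itemsets across several XML documents can be illustrated with a minimal sketch. It is written in Python rather than XQuery and is not the thesis's implementation: the element names `transaction` and `item`, the sample documents in `DOCS`, and the helper names `load_transactions` and `apriori` are all assumptions made for illustration. Pooling transactions from a list of documents stands in, loosely, for what an XQuery collection would provide.

```python
from itertools import combinations
import xml.etree.ElementTree as ET

# Hypothetical sample documents; the <transaction>/<item> layout is an assumption.
DOCS = [
    "<transactions>"
    "<transaction><item>a</item><item>b</item><item>c</item></transaction>"
    "<transaction><item>a</item><item>b</item></transaction>"
    "</transactions>",
    "<transactions>"
    "<transaction><item>a</item><item>c</item></transaction>"
    "<transaction><item>b</item><item>c</item></transaction>"
    "</transactions>",
]


def load_transactions(xml_docs):
    """Pool the transactions of several XML documents into one list of item sets."""
    transactions = []
    for doc in xml_docs:
        root = ET.fromstring(doc)
        for node in root.iter("transaction"):
            transactions.append(frozenset(i.text for i in node.iter("item")))
    return transactions


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets seen in >= min_support transactions."""
    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one scan of the pooled transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    result = apriori(load_transactions(DOCS), min_support=2)
    for itemset, support in sorted(result.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
        print(sorted(itemset), support)
```

Because the transactions are pooled before counting, support is computed over all documents together, which is the behavior the multi-document setting calls for; the repeated scan in the count step is the cost the abstract attributes to Apriori.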

Keywords/Search Tags:

XQuery, Apriori, XML documents, association rules, data mining

Related items

1	The Research Of Association Datamining Based On XML And Web Data
2	Research On The Apriori Algorithms For Meteorological Data Association Rules Analysis Based On Cloud Computing
3	The Research And Implementation Of Association Rules Algorithms-Apriori Based On Cloud Computing
4	Research And Application Of Apriori Mining Algorithm Based On Association Rules In Colleges Management System
5	Research On Incremental Updating Association Rules Mining Based On Apriori Algorithm
6	Study On Associations Rules's Apriori Algorithm In Data Mining
7	Association Rules Mining And Its Applications In Microarray Gene Expression Data
8	Algorithm Based On Association Rules In Data Mining Research And Application
9	Research And Implementation Of Web Log Mining Based On Association Rules Apriori Algorithm
10	Research On Application Of Association Rules Mining Algorithm In Web Log Mining