Data Mining Algorithm Analysis And Improvement Based On Frequent Patterns

Posted on:2008-07-03

Degree:Master

Type:Thesis

Country:China

Candidate:J Jia

Full Text:PDF

GTID:2178360242458820

Subject:Computer applications

Abstract/Summary:

In the age of data explosion, data mining plays an increasingly important role. Recently, many data stream applications, such as network logs and sensor networks, have emerged, and data stream mining has become a new research field and a useful tool in many application areas. Traditional data mining methods cannot process this new model effectively because the data are unbounded, change frequently, and come in more varied structural types.

Finding the most frequent itemsets is an important problem in data mining, a basic research area, and the basis of many data mining methods. In practice, users need to adjust the minimum support threshold in order to find the more useful frequent itemsets.

Among the many existing algorithms for frequent pattern mining, Apriori and FP_TREE are typical. The main characteristic of Apriori is that it starts mining from single items and prunes at every step. With these features, Apriori effectively avoids searching many itemsets that cannot be frequent. However, one problem with Apriori is its candidate itemset generation. The other algorithm, FP_TREE, uses a divide-and-conquer approach: it compresses the information in the dataset into a frequent pattern tree that describes the frequent items, and then iteratively builds auxiliary structures for frequent pattern growth and database partitioning.

Here, we make some improvements to the FP_TREE algorithm and compare it with other algorithms. The improved algorithm preserves the class information in the data while compressing it, by classifying the frequent patterns in the data stream. The experiments indicate that this algorithm is more accurate than the others and that it better handles applications whose training sets contain many default values.

In sum, when mining large data sets, the improved FP_TREE can better solve the fusion problem of large data sets by applying frequent pattern mining. Moreover, both the accuracy and the running time of data classification are improved, and the window size is better controlled thanks to the introduction of a window mechanism.
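
The level-wise mining and pruning that the abstract attributes to Apriori can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `apriori`, the use of an absolute count for `min_support`, and the sample transactions are all assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset.

    Assumption for this sketch: min_support is an absolute count,
    not a fraction of the number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the data for the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the thesis.
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(apriori(data, 3).items(),
                                   key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), support)
```

The join and count steps are where the candidate generation cost mentioned in the abstract arises: each level builds and then scans for a fresh set of candidates, which is the overhead that the FP_TREE approach avoids by compressing the data into a tree first.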

Keywords/Search Tags:

Data mining, Stream data, Frequent pattern, FP_TREE, Apriori

Related items

1	Research On An Application Of Data Stream Query And Data Stream Mining In Oil Field
2	Algorithm Data Stream Frequent Pattern Mining
3	Pattern Mining Algorithms Over Data Streams
4	The Study On Frequent Patterns Mining And Data Predicting Over Data Streams
5	Study On Data Stream Techniques And Its Application In Electric Power Information Processing
6	Research And Application Of Frequent-pattern Mining Methods In Data Stream
7	A Mining Method Based On Utility Pattern From Data Stream
8	Frequent Pattern Mining Algorithm Research For Data Stream
9	Research On Similarity Query And Pattern Mining Algorithms Over Data Stream
10	Research Of Frequent Pattern Mining System On Data Stream