
Research On Technology Of Mining Association Rules In Data Streams

Posted on: 2013-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Tang
Full Text: PDF
GTID: 2248330371478095
Subject: Computer Science and Technology
Abstract/Summary:
In the modern information society, every field must handle ever-growing volumes of data, yet data analysis technology has lagged behind, which wastes time and storage resources. Data mining emerged to address this problem: it discovers hidden, potentially useful knowledge in huge data sets. Within data mining, association rule mining is an important field devoted to finding relationships among data items. Algorithms differ in their ideas, but all proceed in two steps: mining frequent itemsets and then generating association rules. Improving the efficiency of these two steps is therefore the key question in this field. In recent years, as research has developed, many classic algorithms have been proposed. With the rapid growth of information, however, data often arrives continuously, rapidly, and in a time-sensitive manner; that is, it takes the form of data streams. This new data model poses many challenges for traditional algorithms. FP-Stream is a classic algorithm for data streams. It incrementally maintains tilted-time windows for each pattern at multiple time granularities, and interesting queries can be answered under this framework. Although the algorithm is efficient in time and space, it does not compress the data itself, so it cannot be used in high-speed environments. Moreover, the tilted-time window model itself consumes memory. In short, FP-Stream still faces a conflict between limited memory and huge data volumes.
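To illustrate the idea of keeping per-pattern counts at multiple time granularities, the following is a minimal Python sketch of a logarithmic tilted-time window. It is a simplified illustration, not the exact FP-Stream update procedure: the class name `TiltedTimeWindow`, its methods, and the rule of keeping at most two slots per level are assumptions made for this example.

```python
class TiltedTimeWindow:
    """Simplified logarithmic tilted-time window for a single pattern.

    Hypothetical structure for illustration: levels[i] holds at most two
    counts (newest first), and each count at level i summarizes 2**i batches.
    """

    def __init__(self):
        self.levels = []

    def add(self, count):
        """Record the pattern's count for the newest batch."""
        carry = count
        level = 0
        while carry is not None:
            if level == len(self.levels):
                self.levels.append([])
            slots = self.levels[level]
            slots.insert(0, carry)
            if len(slots) <= 2:
                carry = None  # room at this level, nothing to push down
            else:
                # Merge the two oldest counts into one coarser count
                # and carry it to the next (coarser) level.
                oldest = slots.pop()
                second_oldest = slots.pop()
                carry = oldest + second_oldest
                level += 1

    def total(self):
        """Total count over all batches seen so far."""
        return sum(sum(slots) for slots in self.levels)

    def slot_count(self):
        """Number of stored counts; grows logarithmically with batches."""
        return sum(len(slots) for slots in self.levels)


window = TiltedTimeWindow()
for _ in range(8):
    window.add(1)  # the pattern occurs once in each of 8 batches
print(window.levels, window.total(), window.slot_count())
```

Recent batches are kept at fine granularity and older ones are merged into coarser counts, so the number of stored counts grows only logarithmically with the number of batches while the total count is preserved.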
We improve this algorithm by introducing the Dif-bits compression algorithm to compress the initial data streams. In addition, a binary chart transformation is applied to the tilted-time window. Together, these two improvements reduce memory usage. We also introduce additional metrics such as lift, cosine, and interest to extend the minimum-support and confidence framework and improve the accuracy of the results. In short, by improving both steps of the algorithm, we improve its ability to handle data streams, so that it can be applied in more fields.
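As a rough illustration of how such metrics complement support and confidence, the sketch below computes lift and cosine for a rule A → B using their standard textbook definitions. The function names and the toy transactions are assumptions for this example; the abstract does not define the "interest" metric, so it is omitted here.

```python
from math import sqrt


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift, and cosine for the rule A -> B.

    Assumes both A and B occur in at least one transaction.
    """
    s_a = support(transactions, antecedent)
    s_b = support(transactions, consequent)
    s_ab = support(transactions, set(antecedent) | set(consequent))
    if s_a == 0 or s_b == 0:
        raise ValueError("antecedent and consequent must both occur")
    return {
        "support": s_ab,
        "confidence": s_ab / s_a,
        "lift": s_ab / (s_a * s_b),         # > 1 suggests positive correlation
        "cosine": s_ab / sqrt(s_a * s_b),   # in [0, 1]
    }


# Toy data, made up for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
metrics = rule_metrics(transactions, {"bread"}, {"milk"})
print(metrics)
```

In this toy data the rule bread → milk has a confidence of about 0.67 but a lift below 1, which shows how a rule can pass a confidence threshold even though its two sides are slightly negatively correlated.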
Keywords/Search Tags: data streams, association rule, frequent itemset, vertical-FP-Stream, Dif-bits, binary chart, metrics