Research On And Implementation Of Frequent Item Set Mining System In Data Stream

Posted on: 2008-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: L J Chen
Full Text: PDF
GTID: 2178360212984991
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the world has entered the age of the information economy. What users want, however, is not more information but a way to uncover hidden information and to derive higher-level knowledge from it. Search engines are the usual tool, but their performance is constrained by the rapid growth of both the volume of information and the number of users.

There are several ways to improve search engine performance, such as clustering, vertical search engines, ranking optimization, and personalized search. We aim to develop a better algorithm for mining frequent item sets in data streams as a method of information aggregation.

In this thesis, we propose a novel algorithm, Lattice Lossy Counting, which is based on Lossy Counting and is better suited to current applications. By using lattices, the mining algorithm produces time-sensitive results. By dividing the algorithm into two phases, we improve its efficiency while keeping the original precision.

We also developed a novel system for mining frequent item sets in data streams, named Fenster. It takes a stream of click transactions or keyword transactions as input, mines frequent item sets from the stream, and returns results immediately. We introduce two architectures for different application settings: the compressed one is for integrated applications and the distributed one is for enterprise applications.

The experiments show that our algorithm outperforms the classical algorithm in both time and space complexity. Both systems also work well and can be applied in different areas: configured in different ways, they can be used in financial analysis, business analysis, weather forecasting, and environmental monitoring.

The main work here is the improvement of the algorithm and the implementation of the frequent item set mining system. Finally, we give the results of our experiments...
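For background, the classical Lossy Counting algorithm that Lattice Lossy Counting builds on can be sketched as follows. This is a minimal Python sketch for single items only, not the lattice-based, two-phase variant proposed in the thesis, whose details the abstract does not give. The class name `LossyCounter` and the parameter names `epsilon` and `support` are illustrative assumptions.

```python
from math import ceil


class LossyCounter:
    """Classical Lossy Counting for single items (illustrative sketch)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        self.width = ceil(1 / epsilon)  # number of items per bucket
        self.n = 0                      # items seen so far
        self.entries = {}               # item -> [count, max undercount]

    def add(self, item):
        self.n += 1
        # Bucket id of the current item, using integer ceiling division.
        bucket = (self.n + self.width - 1) // self.width
        entry = self.entries.get(item)
        if entry is None:
            # A new item may have been missed in all earlier buckets.
            self.entries[item] = [1, bucket - 1]
        else:
            entry[0] += 1
        # At each bucket boundary, drop entries that cannot be frequent.
        if self.n % self.width == 0:
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        # Report items whose stored count reaches (support - epsilon) * n.
        threshold = (support - self.epsilon) * self.n
        return {
            key: value[0]
            for key, value in self.entries.items()
            if value[0] >= threshold
        }
```

Stored counts never exceed the true counts and fall short by at most `epsilon` times the stream length, which is what lets the structure stay small while still reporting every item whose true frequency reaches `support`. Extending this to item sets and to time-sensitive results is the part the thesis addresses.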
Keywords/Search Tags: data stream, frequent item set, data mining, search engine