Affordable distributed data mining

Posted on: 2005-08-23
Degree: Ph.D.
Type: Dissertation
University: University of Wyoming
Candidate: Sung, Chang Oan
Full Text: PDF
GTID: 1458390008980342
Subject: Computer Science
Abstract/Summary:
While data mining has existed for decades, it has not seen broad application outside of scientific and business arenas, primarily because it requires significant computing resources and complex, customized software. One way in which this technology can be made more generally useful is applying it on networks of heterogeneous desktop computers. Through the use of distributed computing techniques, data mining can be made more "affordable" for applications where data tends to be distributed in non-prescribed formats among users who are only loosely connected to an enterprise or organization.

In this dissertation, we describe two components of affordable distributed data mining, namely adaptability and reliability, and an approach to developing such a system on a peer-to-peer network of desktop computers. Our approach is more generally applicable than previous ones and takes network failures, a problem in peer-to-peer networks, into account.

Our approach is based on virtual data warehousing, which can produce reliable results without the need to create a centralized data store, and a task distribution algorithm that allows idle processors to efficiently perform data mining tasks. A small network implementation is used to evaluate the approach's utility.
Keywords/Search Tags: Data mining, Distributed, Affordable