
Analysis, characterization and design of data mining applications and applications to computer architecture

Posted on: 2010-06-29
Degree: Ph.D
Type: Dissertation
University: Northwestern University
Candidate: Ozisikyilmaz, Berkin
Full Text: PDF
GTID: 1448390002977616
Subject: Statistics
Abstract/Summary:
Data mining is the process of automatically finding implicit, previously unknown, and potentially useful information in large volumes of data. Data mining algorithms have become vital to researchers in science, medicine, business, and security domains. Recent advances in data extraction techniques have resulted in a tremendous increase in the input data size of data mining applications. Data mining systems, on the other hand, have been unable to maintain the same rate of growth. Therefore, there is an increasing need to understand the bottlenecks associated with the execution of these applications on modern architectures.

In our work, we present MineBench, a publicly available benchmark suite containing fifteen representative data mining applications belonging to various categories. First, we highlight the uniqueness of data mining applications. Subsequently, we evaluate the MineBench applications on an 8-way shared memory (SMP) machine and analyze important performance characteristics. Our results show that data mining workloads differ markedly from other common workloads. Therefore, there is a need to address their performance limitations specifically. We propose some initial designs and results for accelerating them using programmable hardware.

After analyzing the data mining applications, we use them to address problems in computer architecture. In one study, we use linear regression and neural network models for design space exploration. Design space exploration is the tedious, complex, and time-consuming task of determining the optimal solution to a problem. Our methodology relies on measuring the performance of a small fraction of the machines, building a model from those measurements, and using it to predict the performance of any machine. We have also shown that, using a subset of the processors available for purchase, we can create a very accurate model of the relation between a processor's properties and its price. In another study, we try to achieve the ultimate goal of computer system design, i.e., satisfying end users, using data mining methods. We aim to leverage the variation in user expectations and satisfaction relative to the actual hardware performance to develop more efficient architectures that are customized to end users.
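
The design space exploration methodology described above (measure a small fraction of the machines, fit a model, predict the rest) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the three configuration parameters, the noise-free linear `measure` function standing in for a slow simulation or benchmark run, and the fixed stride used to pick the measured subset. It uses ordinary least squares only; the neural network models mentioned in the abstract are not shown.

```python
from itertools import product


def measure(clock_ghz, cache_mb, cores):
    """Hypothetical stand-in for a slow simulation or benchmark run."""
    return 10.0 + 5.0 * clock_ghz + 2.0 * cache_mb + 3.0 * cores


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(features, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0, *f] for f in features]  # leading 1.0 is the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Evaluate the fitted linear model on one configuration."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature))


# Assumed design space: every combination of clock (GHz), cache (MB), cores.
design_space = list(product([1.0, 1.5, 2.0, 2.5, 3.0], [1, 2, 4, 8], [1, 2, 4, 8]))

# Measure only every tenth configuration (a fixed stride keeps the sketch
# deterministic; a random sample would be the more usual choice).
sampled = design_space[::10]
weights = fit_linear(sampled, [measure(*cfg) for cfg in sampled])

# Use the model to predict every configuration and compare with the truth.
max_error = max(abs(predict(weights, cfg) - measure(*cfg)) for cfg in design_space)
print(f"measured {len(sampled)} of {len(design_space)} configurations")
print(f"max abs prediction error: {max_error:.2e}")
```

Because the stand-in `measure` function is exactly linear and noise-free, the fitted model reproduces it almost perfectly from the sampled subset. Real measurements are noisy and nonlinear, so the error on unmeasured configurations would have to be checked against held-out machines.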
Keywords/Search Tags: Data mining, Computer, Performance