Parallel Data Mining For Medical Diagnostic Aid Based On Multi-Core CPU System Architecture

Posted on: 2009-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
GTID: 2178360242981481
Subject: Computer application technology
Abstract/Summary:
Data mining is a tool for extracting information and knowledge that is not known in advance but is potentially useful from large volumes of data that may be incomplete, noisy, fuzzy, and random. It is a new model for extracting and refining data. Data mining is widely used in economics, industry, agriculture, and the military, as well as in medicine. It is a powerful new technology that can predict future trends and behavior and can assist decision-making. In short, it helps people acquire knowledge, which is why it is also called "Knowledge Discovery".

Knowledge Discovery is a component of knowledge engineering. Data mining takes the methods and processes of extracting knowledge in a given field as its research direction, so it plays a very important role in solving various sub-topics of knowledge engineering. Medical diagnosis is a subjective judgment based on professional knowledge of a specific field, and the diagnosis itself is a reasoning process based on rules. Doctors gain experience and knowledge over time and develop a reasoning network in their minds, whereas the data of each case is stored in the database as static text.

This thesis discusses knowledge representation systems in an intelligent medical diagnosis system, constructs an algorithm based on the knowledge representation reduction of decision tables, and derives the reduction of decision representations from Rough Sets. An example is then given to analyze the process of knowledge reduction, and a knowledge base is produced through data analysis and deduction. The example uses the cases of coronary heart disease patients, which reveal some surprising medical knowledge.
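The reduction of a decision table described above can be illustrated with a minimal sketch. It is not the algorithm developed in the thesis: it is a brute-force search for reducts, and the attribute names (`chest_pain`, `ecg_abnormal`, `short_breath`), the table values, and the function names are all hypothetical, chosen only to show the idea that an attribute subset is a reduct when it preserves the positive region of the full table.

```python
from itertools import combinations

# Hypothetical toy decision table; names and values are illustrative only
# and are not taken from the thesis data.
CONDITIONS = ("chest_pain", "ecg_abnormal", "short_breath")
TABLE = [
    # (chest_pain, ecg_abnormal, short_breath), decision
    (("yes", "yes", "no"), "chd"),
    (("yes", "no", "no"), "healthy"),
    (("no", "yes", "yes"), "chd"),
    (("no", "no", "yes"), "healthy"),
    (("yes", "yes", "yes"), "chd"),
    (("no", "no", "no"), "healthy"),
]


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    blocks = {}
    for i, (values, _) in enumerate(rows):
        key = tuple(values[a] for a in attrs)
        blocks.setdefault(key, []).append(i)
    return list(blocks.values())


def positive_region(rows, attrs):
    """Rows whose indiscernibility class maps to exactly one decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][1] for i in block}) == 1:
            pos.update(block)
    return pos


def reducts(rows, n_attrs):
    """Minimal attribute subsets that keep the full positive region (brute force)."""
    full = positive_region(rows, range(n_attrs))
    found = []
    for size in range(1, n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if positive_region(rows, subset) == full:
                found.append(subset)
    return found


REDUCT_NAMES = [tuple(CONDITIONS[a] for a in r) for r in reducts(TABLE, len(CONDITIONS))]
```

In this toy table the decision depends only on `ecg_abnormal`, so the other two attributes are dispensable. The exhaustive search is exponential in the number of attributes, which is one reason heuristic and parallel approaches are of interest for realistic tables.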
By analyzing the intelligent decision procedure based on Rough Sets and comparing it with the parameters obtained by the medical diagnosis system, we extract the symptoms that are unimportant for angina, one subtype of coronary heart disease, even though those same symptoms are dominant for the majority of coronary heart diseases. Finally, the complete data mining algorithm is given.

Computers have developed at an astonishing speed. CPU clock speeds now exceed 2 GHz, and transistor performance is close to its physical limit. We have to accept that the age of parallel computing based on multiple processors is arriving. After the turn of the millennium, chips integrating several separate cores appeared in the USA.

Programming in a parallel computing environment differs from traditional serial programming. It requires either manual coordination or compiler-directed coordination among the threads running on the cores, which are either integrated on one chip or distributed across several remote computers connected by a network. Parallel programming is therefore more difficult. One view in the field held that "Parallel computing is the trend of the future and always will be." That statement held true for several decades. Now that parallel computing is becoming routine, combining Rough Set theory with parallel computing becomes a realistic option.

The premise of parallel programming is an in-depth understanding of computer architectures that include multiple cores. The programming rules for the same computing task may differ from one architecture to another. The concept of computer system architecture was introduced by Amdahl, who defined it as the conceptual structure and functional characteristics of a computer as seen by the programmer. Improving computer performance is thus the goal of research in computer architecture.
How to parallelize a computation is therefore very important. Parallelism means completing two or more pieces of work, of the same kind or not, at the same time or within the same time interval. As long as their execution overlaps in time, there is parallelism. Strictly speaking, parallelism in which two or more events take place at the same instant is called simultaneity, and parallelism in which two or more events take place within the same time interval is called concurrency. We first designed a data mining program based on Rough Sets in the traditional way and then, following these two ideas, developed the traditional single-core, single-threaded serial program into single-core multi-threaded and multi-core multi-threaded data mining programs.

While studying data mining on a single-core CPU system with multiple threads, we found that some threads depend on the order of execution. It is of no use to let such threads compete with each other at random for CPU resources. We therefore called user-level threads from the Windows system function library and scheduled them manually, an approach known as fiber programming. After extensive debugging, we were satisfied with the program's performance. However, performance did not double as the number of cores increased, and further research is needed to speed up the program.

After testing, we chose two architectures: an AMD Athlon 64 X2 Dual-Core Processor TK-53 at 1.70 GHz with 512 MB of memory, and an Intel Core 2 Duo CPU T7100 at 1.80 GHz with 2 GB of memory. For the software environment, we used the Microsoft Windows XP SP2 operating system, the Visual Studio .NET 2005 compiler, the Intel C 9.2 compiler, and the VTune 10.1 thread controller. The parallel algorithm was based on the modified multi-fiber data mining program, built with a parallel compiler and using OpenMP 2.0 together with the STL generic interface. Further tests showed that the parallel program works well in the multi-core environment.
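The observation that performance did not double as the number of cores increased is commonly reasoned about with Amdahl's law, which bounds the speedup by the fraction of the work that must stay serial. The sketch below is an illustration only: the thesis does not report a serial fraction, so the value 0.2 used in the example is a hypothetical assumption, and `amdahl_speedup` is a name introduced here.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    serial_fraction is the share of run time that cannot be parallelized
    (a hypothetical input here, not a value measured in the thesis).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical example: 20% of the run time is serial, two cores available.
EXAMPLE_SPEEDUP = amdahl_speedup(0.2, 2)
```

With these assumed numbers the bound is 1 / 0.6, roughly 1.67 rather than 2. Threads that must run in a fixed order, such as the order-dependent threads mentioned above, add to the serial fraction and lower the bound further.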
Both the hardware and the software environments were at the top level for their time. During the experiments, we confirmed that the OpenMP 2.0 and STL standards do not work compatibly together, which makes parallel programming with the STL unsafe.
Keywords/Search Tags:Architecture