
Independent Component Analysis And Its Scientific Data Mining

Posted on: 2008-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: J A He
Full Text: PDF
GTID: 2208360215950022
Subject: Computer application technology
Abstract/Summary:
Technological innovation has enabled scientists to collect huge amounts of data from experiments, simulations, and observations at an ever-increasing pace. Data volumes have grown from the megabit and gigabit scale to the terabit scale. Our ability to generate data has far surpassed our capacity to study, analyze, and understand it.

Data mining is an interdisciplinary field that combines database theory, statistics, machine learning theory, and neural network methods. It is used to search for valid, novel, potentially useful, and understandable patterns. However, scientific data sets are not only very large but also complex and high-dimensional, which challenges traditional data mining techniques. Developing novel, effective scientific data mining techniques, and improving the ability to interact with large-scale, high-dimensional, time-sensitive scientific data, therefore has significant research and practical value.

Independent Component Analysis (ICA) is a relatively new signal processing technique that has attracted wide attention in the international academic community and has become a powerful tool for analyzing high-dimensional data. The fundamental idea of ICA is to analyze the higher-order statistical dependencies among multidimensional data in order to uncover implicit information, removing higher-order redundancy and extracting independent information sources. This property makes ICA promising in applied fields such as image feature extraction, compression, and pattern recognition.

This dissertation introduces the use of ICA in data mining for large-scale scientific data generated by simulations. The method not only simplifies the calculations and reduces the difficulty of the mining process, but also establishes an intrinsic relationship between the raw data and the underlying physical process. The dissertation completes the following tasks:

1. To describe the characteristics of scientific data and the basic methods for studying it, and to provide a detailed introduction to data mining techniques and the fundamentals of data mining systems.
2. To introduce the basics of Independent Component Analysis (ICA), including the relevant theory from statistics and information theory.
3. To offer a thorough description of the theories of Principal Component Analysis (PCA) and ICA, followed by a discussion of their connections and differences.
4. To analyze the HDF5 format for scientific data in detail and apply ICA to actual HDF5 data generated by simulation programs. By effectively reducing the dimensionality of the scientific data, the method accurately extracts the characteristics of the physical process underlying the data and expresses the mining results explicitly.
5. To introduce Electron Cyclotron Resonance (ECR), apply ICA to mining its actual data, and obtain results similar to those in task 4. This step further demonstrates that ICA can be used to find regions of interest with high potential value in scientific data and to increase the efficiency and accuracy of data analysis, suggesting the promise of ICA for large-scale scientific data mining.
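
The relationship between PCA and ICA described above can be illustrated with a small sketch. This is not the code used in the dissertation; it is a minimal two-signal example under stated assumptions: the sources (a sine wave and uniform noise), the mixing coefficients, and all function names (`whiten`, `ica_2d`, `demo`) are hypothetical choices made for illustration. The data are first whitened using second-order statistics (the PCA step), and a rotation is then chosen to maximize the sum of squared excess kurtosis, a fourth-order statistic (the ICA step).

```python
import math
import random


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def excess_kurtosis(y):
    """Fourth-order statistic: zero for a Gaussian, non-zero otherwise."""
    n = len(y)
    m = sum(y) / n
    var = sum((v - m) ** 2 for v in y) / n
    return sum((v - m) ** 4 for v in y) / n / var ** 2 - 3.0


def whiten(x1, x2):
    """PCA step: center, decorrelate, and scale to unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c11 = sum(v * v for v in a) / n
    c22 = sum(v * v for v in b) / n
    c12 = sum(p * q for p, q in zip(a, b)) / n
    # Closed-form principal-axis angle of the symmetric 2x2 covariance.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    p1 = [c * p + s * q for p, q in zip(a, b)]
    p2 = [-s * p + c * q for p, q in zip(a, b)]
    s1 = math.sqrt(sum(v * v for v in p1) / n)
    s2 = math.sqrt(sum(v * v for v in p2) / n)
    return [v / s1 for v in p1], [v / s2 for v in p2]


def ica_2d(x1, x2, steps=180):
    """ICA step: rotate whitened data to maximize squared kurtosis."""
    z1, z2 = whiten(x1, x2)
    best_score, best_pair = -1.0, (z1, z2)
    for k in range(steps):
        # Rotations beyond 90 degrees only permute or flip the outputs.
        phi = (math.pi / 2.0) * k / steps
        c, s = math.cos(phi), math.sin(phi)
        y1 = [c * p + s * q for p, q in zip(z1, z2)]
        y2 = [-s * p + c * q for p, q in zip(z1, z2)]
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if score > best_score:
            best_score, best_pair = score, (y1, y2)
    return best_pair


def demo(n=2000, seed=0):
    """Mix two assumed sources, then try to recover them."""
    rng = random.Random(seed)
    s1 = [math.sin(0.07 * t) for t in range(n)]        # assumed source 1
    s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]    # assumed source 2
    # Hypothetical mixing coefficients standing in for an unknown process.
    x1 = [0.8 * p + 0.3 * q for p, q in zip(s1, s2)]
    x2 = [0.4 * p + 0.9 * q for p, q in zip(s1, s2)]
    return (s1, s2), ica_2d(x1, x2)


if __name__ == "__main__":
    (s1, s2), (y1, y2) = demo()
    for name, y in (("y1", y1), ("y2", y2)):
        print(name, round(correlation(y, s1), 3), round(correlation(y, s2), 3))
```

ICA recovers sources only up to ordering, sign, and scale, so each recovered component should be compared with both sources by absolute correlation. A grid search over the rotation angle is practical only in two dimensions; for higher-dimensional data, an iterative algorithm such as FastICA would be the usual choice.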
Keywords/Search Tags: independent component analysis, principal component analysis, data mining, scientific data