
The Research Of Clustering Analysis Based On Synchronization Theory For Large Scale Dataset And Its Application

Posted on: 2016-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: J H Ma
GTID: 2308330473955930
Subject: Computer software and theory
Abstract/Summary:
In recent years, the rapid development of data storage technology has led to the accumulation of large amounts of valuable data. How to use this "sunk data" effectively to support decision-making, and even to promote economic and social development, has nevertheless become a problem. Data mining is an effective way to address it: it extracts useful information and knowledge hidden in data through a variety of analytical methods, such as prediction and regression, association rule mining, classification, and clustering. Among these, clustering analysis is a basic data mining method that is used not only for data compression, by capturing the patterns hidden in data, but also in many application fields, such as customer segmentation, classification of plant and animal populations, and geographical data analysis. This thesis therefore reviews the state of the art of clustering analysis and, from the perspective of complex networks, studies a clustering algorithm based on synchronization theory for large-scale datasets, together with its applications. The main work is as follows:

1. Following the basic workflow of clustering analysis, the common similarity metrics and evaluation methods are introduced in detail. Clustering algorithms are divided by their underlying idea into four types (partition-based, hierarchical, density-based, and model-based), representative algorithms of each type are summarized, and their application scenarios and workflows are analyzed. Finally, the clustering algorithm based on synchronization theory is studied and its advantages are described.

2. Community detection in a financial network based on synchronization theory is studied. A correlation matrix is computed from the similarity between the time series of stock price fluctuations. Spectral analysis of this correlation matrix shows that the stock network has a clear, complicated community structure. The synchronization-based clustering algorithm is then used to identify the community structure dynamically, and a local order parameter is introduced to decide whether the algorithm has converged, so that the underlying substructure of the data is obtained. The fast community detection algorithm is also applied to verify these results. The partition given by the synchronization-based clustering algorithm is found to be a correct partition of the stock set and to be consistent with the stock taxonomy.

3. The functional connectivity of the brain network, derived from high-resolution EEG time series recorded during a visual task, is analyzed with a data-driven approach based on synchronization theory. The original data are preprocessed according to the characteristics of the electroencephalogram (EEG), and the Symbolic Aggregate approXimation (SAX) algorithm is used to measure similarity. A synchronization-based, data-driven clustering approach is then used to partition the cerebral cortical areas, in order to study functional connectivity from the perspective of cortical correlation. The result is compared with the anatomical parcellation of the brain, which follows the Brodmann segmentation scheme. It is found that the algorithm based on synchronization theory not only accurately reveals the cortex involved in the five-box experiment, but also gives the functional connectivity of the brain network during the visual task.
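
The abstract does not give the update rule of the synchronization-based clustering algorithm, so the following is only a minimal sketch of one commonly used Kuramoto-style formulation, not the thesis's own implementation. Each point is repeatedly pulled toward the points within a radius `eps`, and a local order parameter close to 1 is taken as the sign of convergence. The function names, the parameters `eps`, `r_stop`, `tol`, and `max_iter`, and the choice to count a point as its own neighbor are all assumptions made for illustration.

```python
import math

def neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (includes i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def local_order(points, eps):
    """Local order parameter: mean of exp(-distance) over each point's neighborhood.
    It approaches 1 when every point coincides with all of its neighbors."""
    total = 0.0
    for i, p in enumerate(points):
        nb = neighborhood(points, i, eps)
        total += sum(math.exp(-math.dist(p, points[j])) for j in nb) / len(nb)
    return total / len(points)

def sync_step(points, eps):
    """One synchronous Kuramoto-style update of every coordinate of every point."""
    updated = []
    for i, p in enumerate(points):
        nb = neighborhood(points, i, eps)
        updated.append([
            x + sum(math.sin(points[j][d] - x) for j in nb) / len(nb)
            for d, x in enumerate(p)
        ])
    return updated

def sync_cluster(points, eps, r_stop=1 - 1e-6, tol=1e-3, max_iter=100):
    """Iterate until the local order parameter reaches r_stop (assumed threshold),
    then give the same label to points that ended up at the same position."""
    pts = [[float(x) for x in p] for p in points]
    for _ in range(max_iter):
        if local_order(pts, eps) >= r_stop:
            break
        pts = sync_step(pts, eps)
    labels, centers = [], []
    for p in pts:
        for k, c in enumerate(centers):
            if math.dist(p, c) < tol:
                labels.append(k)
                break
        else:
            centers.append(p)
            labels.append(len(centers) - 1)
    return labels
```

Calling `sync_cluster` on two well-separated groups of two-dimensional points with `eps=0.5` puts each group in its own cluster. In this sketch the number of clusters is not set in advance; it follows from the interaction radius `eps`, which therefore has to be chosen with some care.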
Keywords/Search Tags: clustering analysis, complex theory, synchronization theory, stock taxonomy, brain functional connectivity