Scientific Data Mining System Of Classification And Clustering Applications

Posted on: 2007-10-28  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Li  Full Text: PDF
GTID: 2208360185956511  Subject: Computer system architecture
Abstract/Summary:
Large-scale data in many scientific domains is very difficult to understand or analyze: the volume is large, the features are complex, and extracting knowledge from it is harder still. Scientists are no longer satisfied with traditional queries and statistics on such data; they want to find deeper regularities that can support effective decisions in scientific research. Scientific data mining (SDM) has therefore become necessary.

This thesis studies the theory and application of classification and clustering methods suitable for large-scale scientific data mining. The research focuses on classification and clustering based on decision trees, BP neural networks, grids, density, and partitioning. A model is built for a specific application, molecular dynamics numerical simulation, aimed at classification, clustering, and pattern recognition on scientific datasets.

The thesis proposes four clustering algorithms: a grid-based iterative clustering algorithm, an enhanced clustering algorithm based on density and k-medoids, an enhanced k-means algorithm (optimizing the choice of reference points, "consult_points", in the iterative process), and an enhanced distance-based clustering algorithm. They overcome several limitations associated with the static architecture of traditional models: some traditional algorithms terminate at a local optimum, many produce only round clusters, and partitioning methods need too many iterations. They greatly benefit the analysis of high-dimensional data and can therefore advance research on large-scale pattern recognition.

Combining these with many traditional algorithms, we built a practical SDM system on the J2EE platform and validated the correctness and efficiency of our algorithms. This SDM system offers a new way to mine valuable information from large-scale numerical simulation data.
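For orientation, the traditional partition-based baseline that the abstract's enhanced k-means starts from can be sketched as follows. This is a minimal sketch of standard k-means (Lloyd's iteration), not the thesis's enhanced algorithm, whose details the abstract does not give. The function name `kmeans`, its parameters, and the choice to pass the initial centers explicitly are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, centers, max_iter=100):
    """Standard k-means baseline (illustrative; not the thesis's enhanced variant).

    points  -- list of equal-length numeric tuples
    centers -- initial centers, one tuple per cluster (hypothetical parameter)
    Returns (centers, labels).
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(old)  # empty cluster: leave its center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(data, centers=[(0, 0), (10, 10)]))
```

Because the iteration only ever refines the centers it was given, the result depends on the starting points, and distance to a center favors compact, round clusters. These are the limitations of traditional methods that the abstract names.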
Keywords/Search Tags: classify, clustering, Data Mining, Knowledge Discovery, Pattern Recognition, Grid, density, decision_tree