
Research On The Platform For Clustering Graph Data And Its Implementation

Posted on: 2013-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: W J Chen
GTID: 2218330371455916
Subject: Computer software and applications
Abstract/Summary:
As a common data structure, a graph is composed of vertices and edges and can express rich semantic information. In recent years, graph data mining has become an active research area within data mining. Graph clustering is a key technique in graph data mining: its aim is to find well-connected subgraphs in a large graph, such that vertices within a subgraph are strongly connected while vertices in different subgraphs are only weakly connected. Graph clustering has become an important technique for community detection in large-scale complex networks, and has been widely applied in areas such as chemical compound structure analysis, bioinformatics, machine vision, video indexing, text retrieval, and Web analytics.

In this thesis, we focus on the needs of graph clustering applications, study several graph clustering algorithms, and design and implement a platform for clustering graph data using the SSH (Struts-Spring-Hibernate) framework. The platform integrates a variety of classic graph clustering algorithms. Algorithm parameters can be adjusted to suit the researcher's needs, input and output data follow a unified standard, clustering results can be visualized, and an extensible interface makes it easy to deploy new algorithms on the platform. The goal is to provide an open, portable, and scalable platform for researchers working on graph data mining algorithms. Based on these objectives, the thesis covers the following:

1. First, we introduce background knowledge relevant to implementing the platform, including data mining, graph-based data mining, Eclipse, MVC, and the SSH architecture.
2. Then, building on the classic Locality Sensitive Hashing algorithm and considering the setting of large graph data, we propose G-LSH, a graph clustering algorithm based on Locality Sensitive Hashing. We describe the basic idea of the algorithm and its specific design.
3. Next, we describe the design and implementation of the graph data clustering platform. Specifically, we propose a general framework design and describe the platform's interfaces in detail, which improve the platform's scalability and make it easy for researchers to use. We then present the implementation of the platform, including the data persistence layer, the algorithm library, and the visualization of clustering results.
4. Finally, we evaluate the performance of the G-LSH algorithm on the platform using a biological data set.

The thesis concludes with a summary of the related work and a discussion of future work.
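The abstract does not say how G-LSH works, so the following only illustrates the general Locality Sensitive Hashing idea it builds on and should not be read as the thesis's algorithm. It is a minimal Python sketch, assuming an undirected edge list with non-negative integer vertex IDs: each vertex's closed neighborhood is summarized by a MinHash signature, the signature is split into bands, and vertices that collide in at least one band are merged into the same candidate group. All function names and the parameters `bands`, `rows`, and `seed` are hypothetical choices made for this illustration.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # 2^31 - 1; vertex IDs are assumed to be in [0, PRIME)


def build_adjacency(edges):
    """Return {vertex: closed neighborhood} for an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for v in adj:
        adj[v].add(v)  # closed neighborhood: the vertex counts as its own neighbor
    return dict(adj)


def minhash_signature(items, hash_params):
    """One minimum per hash function h(x) = (a*x + b) mod PRIME."""
    return tuple(min((a * x + b) % PRIME for x in items) for a, b in hash_params)


def lsh_candidate_groups(edges, bands=4, rows=2, seed=0):
    """Merge vertices whose neighborhood signatures agree on at least one band."""
    rng = random.Random(seed)
    hash_params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
                   for _ in range(bands * rows)]
    adj = build_adjacency(edges)

    # Bucket vertices by (band index, slice of the signature for that band).
    buckets = defaultdict(set)
    for v, neighbors in adj.items():
        sig = minhash_signature(neighbors, hash_params)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(v)

    # Union-find: vertices sharing any bucket end up in the same group.
    parent = {v: v for v in adj}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for members in buckets.values():
        ordered = sorted(members)
        for m in ordered[1:]:
            parent[find(m)] = find(ordered[0])

    groups = defaultdict(set)
    for v in adj:
        groups[find(v)].add(v)
    return sorted(sorted(g) for g in groups.values())


# Two disjoint 4-cliques: every vertex in a clique has the same closed neighborhood.
clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(10, 14) for j in range(i + 1, 14)]
print(lsh_candidate_groups(clique_a + clique_b))
# prints [[0, 1, 2, 3], [10, 11, 12, 13]]
```

Vertices with identical neighborhoods always collide, while vertices with disjoint neighborhoods never do under these hash functions. For partially overlapping neighborhoods, the chance of a collision grows with their Jaccard similarity, and `bands` and `rows` control how sharply. On graphs whose clusters are joined by a few edges, the grouping therefore depends on the random hash draws and is only a set of candidates for further refinement.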
Keywords/Search Tags:graph-based data, cluster analysis, Eclipse, SSH, G-LSH