Molecular Conformation Clustering Based On Improved FDP

Posted on: 2022-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhou
Full Text: PDF
GTID: 2480306494956599
Subject: Theoretical Physics
Abstract/Summary:
Molecular dynamics simulates the time evolution of molecules on a computer according to physical laws. With the rapid development of computer hardware, simulation times of microseconds can now be reached for large molecules. Such long simulations generate massive amounts of molecular conformation data, and it is not practical to analyze the conformation in every frame of a trajectory. Instead, conformations with similar properties need to be grouped by a clustering method so that a typical conformation of each cluster can be analyzed.

In 2014, Rodriguez and Laio proposed the Find Density Peaks (FDP) clustering algorithm, a density-based clustering method. FDP considers not only the density of each point but also the relative positions of the points. The algorithm can separate clusters of different shapes and is insensitive to noise points. Because of these advantages, FDP has been used in many fields to solve practical clustering problems. However, the FDP algorithm still has several problems. It consumes too many computing resources, and after clustering the halo points are treated as a single undifferentiated set, so the region between two clusters cannot be described accurately.

To address these limitations, this thesis proposes an improved FDP clustering algorithm. First, the improved algorithm preprocesses the data with K-means clustering and other methods to obtain typical points and their weights. Introducing these weights into FDP reduces the computational complexity of the algorithm while still producing the clustering. The improved algorithm also redefines the halo points of the original FDP algorithm and introduces boundary points to describe the intersection region between clusters. The time complexity and space complexity of the improved FDP algorithm are significantly reduced.

A set of 2D data sets was used to verify the clustering ability of the improved FDP algorithm. Comparisons with the original FDP, the distance-based K-means, and the density-based DBSCAN show the advantages of the improved algorithm. Compared with the original FDP, the improved FDP is faster and uses less memory, and it can also separate the boundary region of each cluster and assign the halo points to individual clusters, so that each cluster has its own halo.

The improved FDP was then applied to cluster molecular conformations obtained from molecular dynamics simulations. Conformations of 20 amino acid units and 3 proteins were clustered at two scales: dihedral angles and secondary structure. The experimental results show that the improved FDP not only clusters the conformations correctly but also identifies the transition conformations between clusters. The computational complexity and clustering accuracy of the improved FDP are better than those of K-means and DBSCAN. This lays a good foundation for subsequent conformational analysis.
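
The abstract does not give the formulas of the improved algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: the trajectory has already been reduced to a small set of representative points with weights (for example K-means centroids and the number of frames each one stands for), and the weighted density of a point is taken to be the total weight within a cutoff d_c. The function name `weighted_density_peaks`, the weighting rule, and the choice of centers by the largest rho * delta are all assumptions of this sketch, not the thesis's definitions. The redefined halo points and the boundary points are not attempted here.

```python
import math

def weighted_density_peaks(points, weights, d_c, n_clusters):
    """Density-peaks clustering on weighted representative points (sketch)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Weighted local density: total weight within the cutoff d_c,
    # including the point's own weight (an assumption of this sketch).
    rho = [sum(weights[j] for j in range(n) if dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that ranks higher in density.
    # The densest point gets its largest distance to any other point.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Cluster centers: the n_clusters points with the largest rho * delta.
    # Sorting `order` (stable sort) keeps the densest point first on ties.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Every other point inherits the label of its denser nearest neighbour,
    # which has already been labelled because we walk in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta

# Hypothetical representatives and weights, for illustration only.
reps = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
w = [5, 1, 1, 4, 1, 1]
labels, rho, delta = weighted_density_peaks(reps, w, d_c=1.5, n_clusters=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this sketch the pairwise distance matrix is built over the m representatives rather than over all N frames, so its cost is O(m^2) instead of O(N^2) in both time and memory. This is one plausible reading of how weighting typical points could reduce the cost of FDP.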
Keywords/Search Tags: Molecular dynamics simulation, K-means, DBSCAN, FDP, improved FDP