The Improvement Of Two Clustering Methods And Its Application Research

Posted on: 2024-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Su
Full Text: PDF
GTID: 2568307121484684
Subject: Operational Research and Cybernetics
Abstract/Summary:
Clustering is an unsupervised learning method that infers cluster labels for the data and uses those labels to reveal the data's intrinsic structure. Traditional clustering methods, including K-Means clustering, Spectral Clustering (SPC), and subspace clustering, are widely used in data science and usually perform well as data preprocessing steps. Among them, spectral clustering is based on graph partitioning. Compared with other traditional methods, it is better suited to manifold-like data and converges readily to a globally optimal solution. However, no existing method handles every clustering situation well. Building on spectral clustering and subspace clustering, this thesis proposes two improved clustering methods: 1) balanced spectral clustering; 2) low-rank representation subspace clustering. In addition, the balanced spectral clustering method is applied to seismic fragility analysis, where it screens out a reasonable subset of seismic wave samples and greatly reduces the amount of subsequent computation. The main innovations of this thesis can be summarized as follows:

1. A Balanced Spectral Clustering (BSPC) method is proposed, which improves on the original spectral clustering. The original SPC algorithm is no longer applicable under class imbalance, where the number of sample points differs greatly from cluster to cluster. To address this, the thesis proposes the BSPC algorithm, built on SPC. BSPC imposes approximately orthogonal constraints on the cluster membership matrix, which reduces the cluster membership of samples in large clusters and increases the cluster membership of samples in small clusters, thereby improving the separation between clusters. Experimental results show that BSPC alleviates the uniform effect of the original SPC on unbalanced data sets (that is, its tendency to produce clusters of similar size) and improves clustering purity.

2. A Subspace Clustering Based on Low-Rank Representation (LRRSC) method is proposed, which improves on the original subspace clustering. Existing subspace clustering methods use the original dataset directly as the dictionary, which is not representative. Based on a low-rank representation of the data points, the thesis proposes a new subspace clustering method, LRRSC. LRRSC uses the nuclear norm to select data features while preserving the original data structure as far as possible, so that feature selection performs better. Experimental results show that LRRSC effectively improves clustering accuracy.

3. Balanced spectral clustering (BSPC) is applied to seismic fragility engineering. Seismic fragility analysis often uses the entire sample set, which leads to excessive computation. The thesis uses the proposed BSPC algorithm to screen typical seismic samples automatically, thereby reducing the subsequent computation. Taking a practical structural engineering project as an example, the thesis studies the seismic fragility of the entire "arch dam-foundation structure" and establishes seismic fragility models based on 15 code response spectrum samples and on the 109 overall samples, respectively, in order to compare their results. The fragility results based on the samples screened by the BSPC algorithm are close to the results based on the whole sample. Under the performance indicators considered, the maximum errors between the two sets of fragility probabilities are 4.39%, 3.84%, and 6.64%, respectively, and the minimum probabilities that the error does not exceed 5% are 92.24%, 99.19%, and 81.75%, respectively. This shows that the BSPC algorithm is effective in screening typical seismic samples.
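As a point of reference for the terms used in the abstract, the following is a minimal sketch of the baseline (unbalanced) normalized spectral clustering pipeline, together with a purity score and a size-scaled cluster membership matrix whose columns are exactly orthonormal. It is not the thesis's BSPC algorithm: the abstract does not state how BSPC relaxes the orthogonality constraint, so that step is left out. The Gaussian affinity, the `sigma` value, the simple k-means routine, and the toy 90-versus-10 point data set are all assumptions made for illustration.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian (RBF) affinity matrix with a zero diagonal (assumed kernel)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embedding(W, k):
    """Top-k eigenvectors of D^{-1/2} W D^{-1/2}, with rows scaled to unit length."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)          # eigenvalues come back in ascending order
    U = vecs[:, -k:]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, k, n_iter=100):
    """Plain Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        new = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def spectral_clustering(X, k, sigma=1.0):
    """Baseline normalized spectral clustering (not the BSPC variant)."""
    return kmeans(spectral_embedding(gaussian_affinity(X, sigma), k), k)

def purity(labels, truth):
    """Fraction of points that fall in the majority true class of their cluster."""
    hits = sum(np.bincount(truth[labels == c]).max() for c in np.unique(labels))
    return hits / len(truth)

def scaled_indicator(labels, k):
    """Membership matrix H with H[i, j] = 1/sqrt(|C_j|) if point i is in cluster j."""
    H = np.zeros((len(labels), k))
    H[np.arange(len(labels)), labels] = 1.0
    return H / np.sqrt(H.sum(axis=0, keepdims=True))

def make_imbalanced_blobs(seed=0):
    """Hypothetical toy data: one cluster of 90 points and one of 10 points."""
    rng = np.random.default_rng(seed)
    big = rng.normal(0.0, 0.3, size=(90, 2))
    small = rng.normal(0.0, 0.3, size=(10, 2)) + np.array([5.0, 5.0])
    truth = np.array([0] * 90 + [1] * 10)
    return np.vstack([big, small]), truth

if __name__ == "__main__":
    X, truth = make_imbalanced_blobs()
    labels = spectral_clustering(X, k=2)
    H = scaled_indicator(labels, 2)
    print("purity:", purity(labels, truth))
    print("H^T H:\n", H.T @ H)
```

In this sketch the hard membership matrix `H` satisfies `H.T @ H = I` exactly because each point belongs to one cluster and each column is scaled by the cluster size. The abstract describes BSPC as imposing only an approximate version of this constraint, which is the part a faithful implementation would need from the full thesis.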
Keywords/Search Tags:Spectral clustering, Subspace, Cluster membership matrix, Orthogonal constraint, Low-rank representation, Seismic fragility analysis