| Spectral clustering is an unsupervised clustering algorithm based on graph theory. It partitions data effectively by converting a data set into an undirected graph. It performs well on non-convex data sets and sample spaces of arbitrary shape, and has been widely used in machine learning, data mining, and other fields. In addition, semi-supervised learning has gradually become an active research topic: prior constraint information can reflect the real structure of a data set and improve clustering performance, so constrained spectral clustering guided by prior knowledge has become a new research direction. In traditional spectral clustering, the parameters of the similarity matrix can only be determined through extensive manual tuning, and the number of clusters must be set manually, so the result is limited by the initial number of clusters and the initial cluster centers. Moreover, the performance of spectral clustering depends heavily on the similarity measure and on the construction of the similarity matrix. This thesis presents in-depth research on spectral clustering algorithms based on constraint information. The specific research contents are as follows: 1. Traditional spectral clustering is limited when dealing with data points that are relatively uniformly distributed or that have multiple density peaks. To address this problem, a density-adaptive spectral clustering algorithm based on constraint information is proposed. A density-adaptive similarity measure is constructed by combining density information with the shared neighbors in each neighborhood, and is then updated using semi-supervised pairwise constraints. The density canopy algorithm is then introduced to pre-cluster the data set and update the initial cluster centers and the number of clusters. Comparison experiments verify that the proposed algorithm enhances the local and global correlation of data points while weakening
the sensitivity of the algorithm. 2. Spectral clustering is prone to interference from local features when processing data with a complex manifold structure. Therefore, an adaptive spectral clustering algorithm with constraint information is proposed based on manifold learning. First, based on manifold learning theory, an adaptive neighborhood expansion strategy is designed to adjust the similarity matrix, and an improved kernel function adjusted by manifold-space distance is constructed. Next, the similarity matrix is refined using pairwise constraint information. Then the p-Laplacian operator is introduced, and the improved SSA-Tent algorithm is used to optimize the parameter p. Experimental comparison verifies that the algorithm proposed in chapter 5 automatically obtains accurate classes, which improves the clustering accuracy of spectral clustering on manifold data. 3. Constraint information for real data is difficult to obtain or is often based on expert experience, which makes the prior information less reliable and makes it easy to overlook local information in the data. To solve these problems, a spectral clustering algorithm based on co-occurrence constraint information is proposed. The K-means algorithm and co-occurrence theory are used to obtain k co-occurrence pseudo-labels, and the similarity matrix is improved by incorporating the natural neighbor method to guide the spectral clustering process. In comparison with five advanced algorithms, the algorithm proposed in chapter 5 reflects the local similarity between data points more objectively and improves clustering accuracy. 4. Traditional unconstrained spectral clustering ignores labeled data samples, and existing studies often consider only a single form of prior information, which fails to make effective use of valuable constraint information to capture the true structure of the data. To address this, a spectral clustering algorithm with extended constraint information is proposed. First, label-type data constraints are
effectively combined with pairwise data constraints to strengthen the role of constraint information in assisting clustering. Second, the initial class centers are selected according to the density characteristics of the sample points and the constraint information. In addition, the pairwise constraint matrix is used during clustering to make more effective use of the constraint information. Experimental comparison shows that the algorithm proposed in chapter 6 reduces the sensitivity caused by the random selection of initial class centers in traditional spectral clustering, and improves clustering accuracy and quality. Through the above work, this thesis not only alleviates the limitations of existing spectral clustering algorithms but also extends research on constrained spectral clustering through the use of constraint information. At the same time, by improving the similarity measure and the prior information, spectral clustering better matches the real structure of the data, which offers new ideas for clustering methods. There are 18 figures, 19 tables, and 173 references in this thesis. |
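
The step shared by the methods summarized above, adjusting a similarity matrix with pairwise constraints before the spectral embedding, can be illustrated with a minimal sketch. It is a generic baseline and not any of the algorithms proposed in the thesis. It assumes a fixed-width Gaussian kernel rather than a density-adaptive or manifold-based measure, and it writes must-link and cannot-link pairs into the matrix by overwriting entries with 1 and 0, which is only one simple scheme among several. The function names `constrained_affinity`, `spectral_embedding`, and `kmeans`, and the parameter `sigma`, are hypothetical and chosen for illustration.

```python
import numpy as np


def constrained_affinity(X, sigma, must_link=(), cannot_link=()):
    """Gaussian affinity matrix with pairwise constraints written into it.

    Assumption: constraints overwrite entries (1 for must-link,
    0 for cannot-link); the thesis may use a different update rule.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W


def spectral_embedding(W, k):
    """Row-normalized eigenvectors of the symmetric normalized Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    U = vecs[:, :k]                  # k smallest eigenvectors
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.maximum(norms, 1e-12)


def kmeans(U, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels


if __name__ == "__main__":
    # Two small, well-separated groups of points (illustrative data only).
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
    W = constrained_affinity(X, sigma=1.0,
                             must_link=[(0, 3)], cannot_link=[(0, 4)])
    print(kmeans(spectral_embedding(W, k=2), k=2))
```

In this sketch the number of clusters `k` and the kernel width `sigma` are still supplied by hand, and the k-means initialization is a fixed heuristic. These are the manual choices that the abstract describes the proposed algorithms as replacing with density-based pre-clustering and adaptive similarity measures.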