
Research On Clustering Algorithm Based On Multi Information Feature Fusion

Posted on: 2022-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z X Liang
Full Text: PDF
GTID: 1488306779482654
Subject: Computer Science and Technology
Abstract/Summary:
Clustering is one of the most important research branches of machine learning. In the current era of information explosion, sample data can easily be obtained through the Internet, social media, public sources and many other channels. Labeling raw data, however, is time-consuming and labor-intensive, so first separating the raw data into independent clusters is often a better choice. Clustering also lays the foundation for subsequent labeling, recognition, classification and other algorithms.

After years of development, face recognition has achieved very good results in both academia and industry. Challenges remain, however, such as facial feature recognition and extraction, and similarity estimation for samples corrupted by varying illumination, shadows, expressions, occlusions and noise. Traditional clustering algorithms are generally based on a single information feature of the objects and therefore tend to capture only a partial view of the data, much like the blind men touching the elephant or someone glimpsing a leopard through a tube. A fusion learning model with multiple information features can integrate more dimensions of information, obtain a more comprehensive understanding of the research object, and thereby improve clustering performance. The main goal of this thesis is to propose a new interdisciplinary information feature extraction technique, fuse the new information feature with various kinds of knowledge about the data itself, and finally propose new, efficient and feasible fusion clustering models.

According to human cognitive intuition, the faces of different people are distinguished by their facial contours, skin texture and similar information. This high-value information mainly lies in the areas of the face image where pixel values change sharply. The worthless information in an image, such as random noise, useless components, light and shadow, occlusions and other interference, is difficult to separate in the original spatial domain but can be captured and separated more conveniently in the frequency domain. Based on this analysis, the image samples are transformed from the original two-dimensional spatial domain to the frequency domain, and the high-value information is extracted with signal processing techniques. At the same time, the useless interference information is captured and filtered out. The resulting new image information feature is named the "High-Frequency Texture Component" (HFTC).

Firstly, by fusing the high-frequency texture component with the manifold information of the data space, a new fusion spectral clustering model, High-Frequency Spectral Clustering (HFSC), is proposed. In HFSC, the high-frequency texture components of all samples are extracted first, and a new distance estimation method is proposed to measure the similarity between sample points according to the characteristics of the high-frequency texture component. The manifold graph structure of the data is constructed from this similarity, and traditional spectral clustering is then applied to the graph to obtain the final cluster labels. Experiments on real face data sets show that HFSC is easy to implement, efficient and accurate. To carry the information feature fusion theory further into industrial practice, an application scheme for processing low-quality images of integrated circuits is designed based on the characteristics of the HFSC algorithm. The scheme can segment, locate, extract features from and classify low-quality scanning electron microscope images of chips, and provides a good data foundation for subsequent verification steps such as defect detection and hardware Trojan detection. Experiments on simulated data sets show that the HFSC algorithm is efficient, accurate and robust, and has high application value.

Subsequently, by fusing the high-frequency texture component with the low-rank information of the data, a new fusion representation of face data is proposed, called High-Frequency Low-Rank Representation (HFLRR). The representation integrates the low-rank information of the original data and the high-frequency texture components in a unified optimization framework, and the optimal fusion representation is obtained by iterative solution. Specifically, in HFLRR the rank of the data matrix, the sparse noise, and the difference between the data representation and the high-frequency texture component matrix are modeled in a single optimization problem, which is solved by the alternating direction method of multipliers (ADMM). The fusion representation learned by HFLRR is applied to subspace clustering on real face data sets. The clustering results show that the low-rank subspace representation combined with high-frequency texture information is more efficient and achieves better performance.

Finally, to further explore the essential characteristics of face image data in the frequency domain and reveal the mechanism behind the high-frequency texture component, a subspace structure recovery algorithm based on the fusion of high-frequency texture components and a low-rank constraint is proposed, called Frequency domain Low-Rank Subspace Recovery (FLRSR). Worthless information and noise interference can be filtered out, and high-value information extracted, more effectively in the frequency domain. It is therefore hypothesized that the frequency domain signal of an image sample has a better low-rank property and is closer to the real subspace structure than the signal in the original domain. Experiments are designed to verify the low-rank property of high-frequency texture information and to explore the factors that influence it, and a subspace recovery framework based on the low-rank property of the frequency domain is then proposed. Experiments on real face data sets show that the low-rank subspace built on the frequency domain of the data has a better block diagonal structure and achieves better clustering performance than the compared techniques.
Keywords/Search Tags: Face recognition, machine learning, information fusion, spectral clustering, subspace learning