
Research On Semi-Supervised Multi-View Clustering And Two-View Multi-Instance Clustering

Posted on: 2022-09-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Cai
Full Text: PDF
GTID: 1488306317494384
Subject: Control Science and Engineering
Abstract/Summary:
In the data mining era, data are often collected from different sources or extracted by different feature extractors. For example, a piece of news is reported by several news media; one document is written in several languages; a micro-video contains images, sound and subtitles. Data that describe the same object from different perspectives in this way are referred to as multi-view data. Each individual view can be used to build a single-view clustering model, but such a model cannot exploit the complementary information between views, which limits its clustering performance. How to fully exploit this complementary information has become an important challenge, and multi-view clustering has attracted widespread attention. In addition, multi-view multi-instance data also arise in practice, such as images with text annotations, where both the image data and the text data take the form of bags. How to cluster these image-text bags effectively has also attracted attention.

From multi-view clustering to two-view multi-instance clustering on image-text data, existing research has the following shortcomings:

(1) Multi-view non-negative matrix factorization models are unsupervised, so they cannot make effective use of label information.

(2) Multi-view non-negative matrix factorization models cannot guarantee that the feature representations obtained for the different views have the same scale. These models integrate information across views by fusing the per-view feature representations, and fusing representations of different scales inevitably degrades clustering performance.

(3) Existing models are not suitable for the two-view multi-instance clustering problem on image-text data.

To address these problems, this dissertation studies multi-view clustering under the constrained non-negative matrix factorization framework, and two-view multi-instance clustering on image-text data under the concept factorization framework. The main contributions are as follows:

1. To address the inability of multi-view non-negative matrix factorization to use label information, this dissertation designs a semi-supervised multi-view clustering model based on constrained non-negative matrix factorization. First, within the constrained non-negative matrix factorization framework, the model uses the label information to construct a label constraint matrix shared by all views, and uses this matrix to merge the sample points of the same class in each view, which preserves the label information of samples from the same class. Then, the model integrates the complementary information between views through a co-regularization term, and extracts robust features of each view through a sparseness constraint term. Finally, experimental results on text and image multi-view datasets show that the model significantly improves clustering performance.

2. To address the information fusion problem in multi-view non-negative matrix factorization, this dissertation designs a semi-supervised multi-view clustering model based on orthonormality-constrained non-negative matrix factorization. First, within the constrained non-negative matrix factorization framework, the model learns a low-dimensional feature representation of each view and also merges the samples with the same label in each view. Then, the model introduces a novel orthonormality constraint term to obtain a discriminative, normalized feature representation matrix for each view. Subsequently, the model uses a co-regularization term to integrate the complementary information between views. Experimental results show that the designed model achieves better clustering performance.

3. To address the two-view multi-instance clustering problem on image-text data, this dissertation designs a semi-supervised two-view multi-instance clustering model. The model introduces a multi-instance kernel into concept factorization, and learns an association matrix for each individual view together with a cluster indicator matrix shared by both views. Then, with the help of the ℓ2,1-norm, the model obtains the optimal association matrix and the desired cluster indicator matrix. Subsequently, to enhance the discriminability between bags, the model forces the similarity between the cluster indicator vectors of bags with the same label to approach 1, and the similarity between the cluster indicator vectors of bags with different labels to approach 0. Experimental results show that the designed model significantly improves two-view multi-instance clustering performance.
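
Two building blocks named in the abstract, the label constraint matrix and the ℓ2,1-norm, can be illustrated with a minimal sketch. The abstract does not give the dissertation's exact formulation, so this follows the usual constrained non-negative matrix factorization construction as an assumption: each labeled sample is mapped to a column for its class, each unlabeled sample gets its own column, and the representation is V = A Z. The function names `label_constraint_matrix` and `l21_norm`, the use of -1 to mark unlabeled samples, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def label_constraint_matrix(labels, n_classes):
    """Build a label constraint matrix A (assumed construction, not the author's code).

    labels: class index per sample, with -1 marking an unlabeled sample.
    Returns A of shape (n_samples, n_classes + n_unlabeled).
    """
    labels = np.asarray(labels)
    labeled = np.flatnonzero(labels >= 0)
    unlabeled = np.flatnonzero(labels < 0)
    A = np.zeros((labels.size, n_classes + unlabeled.size))
    # Labeled samples of the same class share one column.
    A[labeled, labels[labeled]] = 1.0
    # Each unlabeled sample gets a private column.
    A[unlabeled, n_classes + np.arange(unlabeled.size)] = 1.0
    return A

def l21_norm(M):
    """l2,1-norm: sum of the Euclidean norms of the rows of M."""
    return float(np.sqrt((M ** 2).sum(axis=1)).sum())

# Toy example: samples 0 and 2 share class 0, sample 1 is class 1,
# samples 3 and 4 are unlabeled.
labels = [0, 1, 0, -1, -1]
A = label_constraint_matrix(labels, n_classes=2)

# Z stands in for the auxiliary factor learned by the factorization;
# here it is random, only to show the effect of A.
rng = np.random.default_rng(0)
Z = rng.random((A.shape[1], 3))
V = A @ Z  # low-dimensional representation, one row per sample

same_class_merged = np.allclose(V[0], V[2])
```

Because A is shared by all views in the model described above, every view maps samples with the same label to the same row of its representation, while the unlabeled rows remain free to be learned.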
Keywords/Search Tags: Multi-view, Multi-instance, Non-negative Matrix Factorization, Concept Factorization, Orthonormality constraint