Non-negative Low-rank And Group-Sparse Matrix Factorization And Application In Image Retrieval

Posted on: 2016-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Wu
GTID: 2348330536967712
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, people have collected large amounts of data in daily life and industry. However, the proportion of valuable information in this data is relatively small, so big data mining techniques need to be developed. With the deployment of the national 'Internet Plus' strategy, data will be collected more widely, including multimedia such as audio, video, pictures, and webpages. Amid this explosive growth, big data mining transforms data into practical, valuable information. Progress in the Human Genome Project, the Human Protein Project, and the Brain Project is rapidly accumulating biological data, and big data mining on such microscopic-level data assists medical treatment and improves health. The development of machine learning techniques has shed light on how to optimize big data mining.

In recent years, deep learning has come to the fore as a hot topic. However, deep learning needs a large number of training samples to tune the model before it can make inferences and predictions on a relatively small number of samples, which is a demanding requirement. Traditional machine learning techniques place a looser requirement on the scale of the training set and can be regarded as an effective complement to the popular deep learning paradigm. Therefore, this thesis studies traditional machine learning techniques in the application of big data mining.

Traditional machine learning offers many methods, such as Support Vector Machines, Linear Discriminant Analysis, and K-means. They do not fully meet our needs because of the redundant features in high-dimensional data, so a new dimensionality reduction method is needed to extract the critical features from redundant high-dimensional data. Principal Component Analysis (PCA) is a typical dimensionality reduction method. However, on non-negative data, Non-negative Matrix Factorization (NMF) performs better than PCA because it extracts local features, even though PCA is one of the state-of-the-art methods. NMF has been widely applied in many fields and has attracted increasing attention from researchers at home and abroad. However, NMF cannot capture the cluster memberships among examples while remaining immune to outliers.

In this thesis, we propose a new NMF model, Non-negative Low-rank and Group-sparse Matrix Factorization (NLRGS). The main contributions are as follows:

1) The low-rank and sparse decomposition recovers the low-rank part of the data from corrupted observations while capturing the outliers as the sparse component. To capture the relationships among sample classes, identify the outliers, and retain the properties of NMF (non-negativity and parts-based representation), NLRGS incorporates a low-rank constraint and a group-sparse constraint into NMF.
2) The objective function of NLRGS is non-convex and difficult to solve, as it contains many equality constraints together with the low-rank and group-sparse constraints. We apply the augmented Lagrangian method to remove the equality constraints, optimize the transformed objective, and construct the optimization algorithm for NLRGS.
3) We learn the cluster centroids with NLRGS for Content Based Image Retrieval (CBIR). Moreover, based on the optimization of NLRGS, we develop an image coding method.
4) To address large-scale image retrieval, we propose a parallel NLRGS and parallelize the SIFT-based CBIR pipeline.

Experimental results show that NLRGS not only performs better on face data sets than traditional NMF methods, but also outperforms traditional K-means-based image retrieval methods in CBIR.
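
For orientation, the baseline NMF that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard multiplicative-update NMF for the Frobenius-norm objective, not the NLRGS algorithm from the thesis: the low-rank and group-sparse terms and the augmented Lagrangian solver are not reproduced here. The function name, parameters, and synthetic data are assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Baseline NMF sketch: approximate X by W @ H with W, H >= 0.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective. Hypothetical helper for illustration only; it is not
    the NLRGS method described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Non-negative random initialization keeps all later iterates non-negative.
    W = rng.random((n_features, rank))
    H = rng.random((rank, n_samples))
    errors = []
    for _ in range(n_iter):
        # Update the coefficients H, then the basis W; eps avoids division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(X - W @ H, "fro"))
    return W, H, errors

# Synthetic non-negative data with an exact rank-5 factorization (assumed example).
data_rng = np.random.default_rng(1)
X = data_rng.random((20, 5)) @ data_rng.random((5, 30))
W, H, errors = nmf_multiplicative(X, rank=5)
```

Each column of W acts as a non-negative basis vector (a "part"), and each column of H holds the non-negative coefficients of one sample, which is the parts-based representation the abstract says NLRGS retains.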
Keywords/Search Tags: Non-negative Matrix Factorization, Non-negative Low-rank and Group-sparse Matrix Factorization, Augmented Lagrangian method, Scale-Invariant Feature Transform, Content Based Image Retrieval