Non-negative matrix factorization (NMF) is a dimension reduction method that has been widely used in pattern recognition, data mining, and information retrieval since the end of the twentieth century. Distinguished from other matrix decomposition methods such as singular value decomposition, QR decomposition, Cholesky decomposition, LU decomposition, and eigenvalue decomposition, NMF approximates a non-negative matrix by the product of two low-rank non-negative matrices. Since NMF represents each sample by an additive combination of features, it yields a naturally sparse and parts-based representation. Such a representation is consistent with physiological and psychological evidence about the human brain, and thus NMF usually suppresses noise. In addition, NMF-based representation is consistent with the physical hypotheses of many real-world problems, because it allows no negative entries in the features. Therefore, NMF is widely used in text mining, email surveillance, blind source separation, video analysis, face recognition, image annotation, image segmentation, spectral analysis, gene microarray analysis, etc. Recently, NMF has received increasing attention from institutes such as Bell Laboratories, the University of Tennessee, Wake Forest University, the Georgia Institute of Technology, the University of Helsinki, the RIKEN Brain Science Institute, National Tsing Hua University, and Zhejiang University, whose significant work extends the applications of NMF to fields such as the Internet, information security, remote sensing, and mathematical modeling. Thus studies on NMF are practically important.

This thesis first builds a non-negative patch alignment framework (NPAF), which interprets existing non-negative dimension reduction algorithms from a unified viewpoint and helps in developing new NMF extensions. Then we propose non-negative discriminative locality alignment (NDLA) based on NPAF. NDLA overcomes several drawbacks of NMF and improves its classification performance.
Thirdly, to overcome the slow convergence of traditional NMF optimization algorithms, we propose a fast gradient descent (FGD) method, which searches for the optimal step size along the rescaled negative gradient direction by using Newton's method. Fourthly, to overcome the deficiencies of traditional optimization algorithms for non-negative least squares (NLS), we apply the optimal gradient method (OGM) to NLS; it converges at the rate of O(1/k^2) without line search. Furthermore, we propose an efficient NMF solver as well as an efficient NPAF solver based on OGM. Finally, we propose an online NMF optimization algorithm based on robust stochastic approximation (RSA) to reduce the high computational overhead of traditional NMF optimization algorithms when running on streaming data. Benefiting from the RSA algorithm, the proposed online NMF algorithm performs robustly in practice. The main contributions are as follows:

1. Non-negative patch alignment framework. Since NMF was proposed, there have been many works on NMF models. In particular, Li et al. proposed local NMF (LNMF), which yields parts-based features in most cases, based on the fact that local visual features are approximately orthogonal. To incorporate the geometric structure embedded in the data, Cai et al. proposed graph regularized NMF (GNMF). To incorporate label information, Zafeiriou et al. proposed discriminative NMF (DNMF) by using Fisher's discriminant criterion. However, all these methods were designed by researchers' intuition and experience, and thus it is difficult to understand their intrinsic differences. This thesis proposes a non-negative patch alignment framework that interprets NMF and its extensions from a unified viewpoint. NPAF not only helps engineers select suitable models in practice, but also helps researchers develop new NMF models. By using the Lagrange multiplier method, we develop a multiplicative update rule (MUR) for optimizing NPAF.
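For the special case of the standard Frobenius-norm NMF objective, which NPAF includes, the multiplicative update rule takes the classical Lee–Seung form. A minimal NumPy sketch (the function name and parameters are illustrative, not from the thesis):

```python
import numpy as np

def nmf_mur(V, r, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative V (m x n) into W (m x r) @ H (r x n)
    with multiplicative updates for the Frobenius objective ||V - WH||_F."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H); eps guards against division by zero
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # W <- W * (V H^T) / (W H H^T)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage: the objective is non-increasing under these updates,
# and both factors stay non-negative throughout.
V = np.random.default_rng(1).random((20, 15))
W, H = nmf_mur(V, r=4)
```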
We prove the convergence of MUR by using the auxiliary function technique. Note that MUR can optimize most NPAF-based models, including the traditional NMF model.

2. Non-negative discriminative locality alignment. From the viewpoint of NPAF, LNMF builds a local patch for each sample and each basis vector by themselves, and retains as much energy as possible in its part optimizations. GNMF builds a local patch for each sample from the sample itself and a limited number of nearest neighbors, and preserves the local geometric structure in the part optimization, but it completely ignores label information. Although DNMF incorporates discriminative information, the patches it builds are global, and thus DNMF performs poorly, especially when the data do not obey a Gaussian distribution. We propose non-negative discriminative locality alignment to overcome these drawbacks. NDLA builds two types of patches for each sample, i.e., an inner-class patch and a between-class patch: the inner-class patch consists of the sample itself and its limited nearest neighbors from the same class, while the between-class patch consists of the sample itself and its limited nearest neighbors from different classes. The part optimization on the inner-class patch minimizes the distances between samples of the same class to preserve the local geometric structure, while the part optimization on the between-class patch maximizes the margins between different classes to incorporate discriminative information. We align both part optimizations by using the alignment strategy and arrive at the objective of NDLA by combining them. Note that NDLA can be optimized by the proposed MUR for NPAF.

3. Fast gradient descent for NPAF. To overcome the slow convergence of MUR, we revisit the proposed MUR from the viewpoint of gradient descent and propose a fast gradient descent method for NPAF.
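The gradient-descent reinterpretation is easy to check numerically: for the Frobenius objective, the MUR step on W coincides with a gradient-descent step whose element-wise step size is W / (W H H^T). A small NumPy verification (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((8, 6))
W = rng.random((8, 3)) + 0.1
H = rng.random((3, 6)) + 0.1

grad = (W @ H - V) @ H.T               # gradient of 0.5*||V - WH||_F^2 w.r.t. W
eta = W / (W @ H @ H.T)                # element-wise (rescaled) step sizes
W_gd = W - eta * grad                  # gradient-descent form of the update
W_mur = W * (V @ H.T) / (W @ H @ H.T)  # multiplicative form of the update
print(np.allclose(W_gd, W_mur))        # True: the two forms coincide
```

This view makes MUR a gradient method with a fixed, conservative step-size choice, which is exactly what FGD relaxes by searching for a better step size along the same rescaled direction.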
FGD searches for the optimal step size along the rescaled negative gradient by using Newton's method and updates the matrix factor without leaving the positive orthant. As FGD uses the optimal step size, it greatly accelerates the convergence of MUR. By using Jensen's inequality, we prove the convergence of FGD. However, the optimal step size may reduce to 1 in some cases, which makes FGD degenerate to MUR. To overcome this deficiency, we assign one step size to each column (or row) of the matrix factor and propose a multiple step size FGD (MFGD). In MFGD, the optimal step size vector is obtained by the multivariate Newton method. To reduce the computational complexity of MFGD, we refine the step size setting strategy and propose balanced multiple step size FGD (BFGD). Note that the convergence of both MFGD and BFGD can be proved by using Jensen's inequality.

4. Optimal gradient method for NPAF. By proving the convexity of each sub-problem of NPAF, we apply the optimal gradient method to alternately optimizing the matrix factors. NMF optimization has received much attention, and many methods exist, such as the multiplicative update rule, projected gradient, quasi-Newton, and active set methods. However, they suffer from one or more of the following drawbacks: (1) slow convergence; (2) high computational complexity; and (3) numerical instability. Therefore, how to efficiently optimize NMF is still an open problem. This thesis proves that each sub-problem of NMF is convex and that its gradient is Lipschitz continuous, and thus the optimal gradient method can be applied to optimize each matrix factor at the optimal convergence rate O(1/k^2) without line search. Similarly, by proving the convexity of both sub-problems of NPAF, we propose an efficient NPAF optimization algorithm based on OGM.

5. Online RSA algorithm for NMF. The space complexity of traditional NMF optimization algorithms is proportional to the sample dimensionality and the number of samples.
This prohibits applying NMF to streaming datasets. In addition, traditional NMF algorithms must be restarted on the arrival of each new sample or chunk of samples, which incurs increasingly large time overheads. Therefore, online NMF algorithms have been proposed that update both matrix factors on the arrival of a new sample or chunk of samples. However, existing online NMF algorithms suffer from numerical instability caused by noise in the data and rank deficiency of the matrix factors. This thesis proposes a new online optimization algorithm that updates the basis matrix by robust stochastic approximation on the arrival of a new sample or chunk of samples. The RSA algorithm converges at the rate of O(1/√k) in most cases and greatly enhances the robustness of online NMF. By using the theory of quasi-martingales, we prove the convergence of the proposed online optimization algorithm. To overcome the high space complexity problem, we introduce a buffer that stores a limited number of historical samples and refine the RSA-based algorithm: on the arrival of a new sample, it replaces the oldest sample in the buffer so as to incorporate sufficient statistical information into the basis matrix.
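The buffer-based online scheme can be sketched as follows. This is a simplified illustration, not the thesis algorithm: a plain projected stochastic-gradient step with an O(1/√t) step size stands in for the RSA update of the basis matrix, coefficients are fitted by a few multiplicative updates, and all names are illustrative.

```python
import numpy as np
from collections import deque

def online_nmf(stream, r, buffer_size=10, inner_iters=20, eps=1e-9, seed=0):
    """Online NMF sketch: keep a fixed-size buffer of recent samples; on each
    arrival, evict the oldest sample and refine the basis W by a projected
    stochastic-gradient step with a decaying step size."""
    rng = np.random.default_rng(seed)
    buf = deque(maxlen=buffer_size)     # oldest sample dropped automatically
    W = None
    for t, v in enumerate(stream, start=1):
        if W is None:                   # lazily size W from the first sample
            W = rng.random((v.size, r)) + eps
        buf.append(v)
        V = np.stack(buf, axis=1)       # d x |buffer| mini-batch
        H = rng.random((r, V.shape[1])) + eps
        for _ in range(inner_iters):    # fit coefficients by multiplicative updates
            H *= (W.T @ V) / (W.T @ W @ H + eps)
        step = 1.0 / np.sqrt(t)         # O(1/sqrt(t)) decaying step size
        grad = (W @ H - V) @ H.T / V.shape[1]
        W = np.maximum(W - step * grad, eps)   # project back onto the non-negative orthant
    return W

# Usage on a synthetic stream of 50 samples of dimensionality 30:
stream = (np.random.default_rng(2).random(30) for _ in range(50))
W = online_nmf(stream, r=5)
```

The memory footprint is bounded by the buffer size rather than the total number of samples seen, which is the point of the buffered variant described above.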