
The Study Of Nonnegative Matrix Factorization And Rough Set Theory

Posted on: 2019-06-21    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X X Zhang    Full Text: PDF
GTID: 1368330548969228    Subject: System analysis, operations and control
Abstract/Summary:
The uncertainty of high-dimensional data is usually caused by sparsity and inconsistency. Sparsity arises from missing data, while inconsistency is produced by duplicate samples or by inconsistent samples that share the same features but carry different labels. To simplify the representation of high-dimensional data, this dissertation employs nonnegative matrix factorization (NMF) and rough set theory to investigate dimensionality reduction, focusing on three aspects: algorithm construction, algorithm optimization, and algorithm generalization. The detailed contributions are as follows:

1. We propose a new NMF-based method, called Linear Projection and Graph Regularized Nonnegative Matrix Factorization (LPGNMF). This method adds a new regularizer to the objective function of NMF, where the regularizer is built from a linear projection of the original rating matrix onto the item latent matrix. This projection couples the rating matrix with the item latent matrix and therefore yields a stronger regularizer. The method also preserves the local geometry between samples in the low-dimensional space, under the assumption that the similarities between a sample and its neighbors are retained after dimensionality reduction. We optimize LPGNMF using multiplicative update rules and alternating update rules, respectively. Experimental evaluations demonstrate the effectiveness of the algorithm in predicting missing ratings.

2. Existing incremental nonnegative matrix factorization algorithms are built on the assumption that samples are independent, so they update only the new sample's latent vector when a new sample arrives, which may lead to inferior solutions. To address this issue, we drop that assumption and develop a novel incremental NMF algorithm based on correlation, called Incremental Nonnegative Matrix Factorization based on Correlation and Graph Regularization (ICGNMF). The correlation is mainly used for finding out
those correlated samples that need to be updated, so that rows of the coefficient matrix that would otherwise stay fixed can also be revised whenever new samples arrive. During the incremental process, samples added earlier are treated as training data when the next new sample arrives, which helps minimize the objective function. We derive the update rules for ICGNMF using multiplicative iterative updates. Experimental results show that ICGNMF reduces the error more effectively than competing methods.

3. We develop a generalized dominance relation and a generalized α-dominance relation based on the dominance intuitionistic fuzzy information system and study their attribute reduction. The generalized dominance relation considers only the dominance relation between samples' comprehensive values, while the generalized α-dominance relation adds further requirements on individual attributes, controlling the number of attributes that must satisfy the dominance relation; from it, all "at least" and "at most" decision rules can be induced. These two kinds of dominance relations provide a new way to partition data and reduce the information loss incurred when extracting rules.

4. We analyze the generalization error bounds of rough-set-based prediction algorithms by means of algorithmic stability analysis. The bounds imply that the performance of the proposed rough-set-based prediction algorithm is related to the number of rules and the stability parameter: the more samples and the smaller the stability parameter, the tighter the bounds. We also give generalization error bounds for the Confidence algorithm and the max-confidence, min-support (MCMS) algorithm. The results show that the generalization bound of the Confidence algorithm decreases as the number of samples and the granularity increase, and that of MCMS decreases as the number of samples and the min-support threshold increase. Greater granularity and a larger min-support threshold lead to larger empirical error as well as larger generalization error; conversely, smaller values may cause the Confidence and MCMS algorithms to overfit. Finally, several experiments are performed to verify these conclusions.
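Contributions 1 and 2 both build on the standard multiplicative update rules for NMF. As a reference point only, a minimal sketch of the classical Lee-Seung updates for minimizing ||V - WH||_F^2 is given below; the projection and graph regularizers specific to LPGNMF and ICGNMF are omitted, and all names here are illustrative:

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-9, seed=0):
    """Classical multiplicative-update NMF (Lee-Seung) minimizing
    ||V - W H||_F^2. The dissertation's LPGNMF/ICGNMF add extra
    regularization terms to this objective, which this sketch omits."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + eps   # nonnegative random initialization
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonnegative "rating" matrix: 20 users, 15 items, rank-4 factorization.
V = np.random.default_rng(1).random((20, 15))
W, H = nmf_multiplicative(V, k=4)
err = np.linalg.norm(V - W @ H)  # reconstruction error shrinks over iterations
```

Because each factor is updated by a nonnegative multiplicative correction, no projection step is needed to keep the factors in the nonnegative orthant, which is why this scheme is the usual starting point for regularized variants.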
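The two dominance relations of contribution 3 can be illustrated with a small, hypothetical sketch. The score function s(μ, ν) = μ - ν, the mean-score "comprehensive value", and the ceiling-based threshold below are common conventions assumed purely for illustration, not the dissertation's exact definitions:

```python
from math import ceil

def score(ifv):
    """Score of an intuitionistic fuzzy value (membership, non-membership);
    s = mu - nu is one common convention, assumed here for illustration."""
    mu, nu = ifv
    return mu - nu

def comprehensive(sample):
    """Comprehensive value of a sample: mean score over its attributes
    (an assumed aggregation, not necessarily the dissertation's)."""
    return sum(score(v) for v in sample) / len(sample)

def dominates(a, b):
    """Generalized dominance: compares comprehensive values only."""
    return comprehensive(a) >= comprehensive(b)

def alpha_dominates(a, b, alpha):
    """Generalized alpha-dominance: comprehensive dominance plus a minimum
    fraction alpha of single attributes on which a's score >= b's score."""
    if not dominates(a, b):
        return False
    wins = sum(score(x) >= score(y) for x, y in zip(a, b))
    return wins >= ceil(alpha * len(a))

# Two samples described by three intuitionistic fuzzy attribute values.
a = [(0.7, 0.2), (0.5, 0.4), (0.6, 0.1)]
b = [(0.6, 0.3), (0.6, 0.2), (0.4, 0.3)]
```

Here a dominates b on comprehensive values, and raising alpha tightens the relation: a alpha-dominates b at alpha = 0.6 (two of three attributes agree) but not at alpha = 1.0, which mirrors how the parameter controls the number of attributes that must satisfy the dominance relation.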
Keywords/Search Tags:Nonnegative matrix factorization, incremental nonnegative matrix factorization, generalized dominance relation, dominance intuitionistic fuzzy information system, stability, generalization error bounds