
Research on Extremely Sparse Data with Blocks-Coupled Non-negative Matrix Factorization

Posted on: 2018-03-24    Degree: Master    Type: Thesis
Country: China    Candidate: W T Chen    Full Text: PDF
GTID: 2348330563452431    Subject: Computer Science and Technology
Abstract/Summary:
With the development of information technology, the Internet has become an indispensable part of our lives. How to use data mining on Internet big data to improve user experience and work efficiency has become a research hotspot in both academia and industry. However, most algorithms and applications analyze users based on their historical behavior and information, which means that the user information matrix is extremely sparse. As a result, these applications and systems perform very poorly or are not useful at all. How to rebuild user information from the sparse matrix of user behavior and information is therefore the key bottleneck of Internet data mining.

Matrix reconstruction, and coupled non-negative matrix factorization in particular, offers researchers a new approach. The method can alleviate the sparsity of a matrix without changing its original features, and it has attracted much attention for that reason. Several problems in coupled non-negative matrix factorization nevertheless remain to be solved:

1. How can extremely sparse data be used more efficiently? When the user information matrix is extremely sparse, increasing the utilization of user data is a good way to address the sparsity problem. However, the data in the matrix are scarce and not closely related to each other, and the traditional coupled non-negative matrix factorization method also discards the information in some of the original matrices. When the user information is already extremely sparse, it is particularly important to take full advantage of all the original matrix information. To improve the accuracy of data reconstruction, a way to enhance the data features is needed.

2. How can the block-coupled relationship of the data matrix be kept? Maintaining the coupling relationship between different blocks while the user information matrix is decomposed and reconstructed is very important for reconstructing it accurately. Traditional coupled non-negative matrix factorization methods give little consideration to this problem. They have great difficulty maintaining the coupling relationship during decomposition because they rely only on associating initial values.

3. How can latent relations among the data be added? In practice, data in fields such as recommendation, collaborative filtering and image recovery are generally related to one another. It is important to find out how to use these relations to build regularization operators that place strong constraints on the reconstruction of the user information matrix and improve reconstruction performance.

To solve the problems above, this thesis proposes an algorithm called Blocks-Coupled Non-Negative Matrix Factorization (B-NMF). Its main contributions are:

1. To improve the utilization of extremely sparse data, B-NMF uses a spiral block method: the original user information matrix is reconstructed four times in a spiral order without discarding any user information. This not only makes full use of the user information matrix but also improves the stability of matrix reconstruction.

2. To maintain the block-coupled relationship of the data matrix, the coupling between different blocks is strengthened during matrix decomposition by introducing a decentralized coupling regularization term.

3. To add latent relations among the data, a homophily (homogeneity) assumption is introduced: similar users should have similar representation vectors. This adds a homophily regularization constraint to the reconstruction of the user information matrix. By exploiting the latent user relations in the user information matrix, the algorithm constrains the reconstruction and improves its performance.

Finally, the thesis evaluates B-NMF experimentally. The experiments show that B-NMF performs much better than current mainstream non-negative matrix factorization methods and their extensions, and that it is also very stable. B-NMF was further tested in collaborative filtering and face detection. Under extremely sparse conditions, performance improves significantly after reconstruction with B-NMF. In face detection in particular, the performance obtained with B-NMF is already comparable to that obtained with the dense original images.
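
The homophily constraint in contribution 3 can be illustrated with a small sketch. This is not the thesis's B-NMF: the abstract does not specify the spiral block method or the decentralized coupling term, so both are left out. What follows is a generic masked, graph-regularized NMF with multiplicative updates, assuming the objective ||M * (X - U V^T)||_F^2 + lam * tr(U^T L U), where M marks observed entries, W is a user-similarity matrix and L = D - W is its graph Laplacian. The function names, the parameter `lam`, and the toy data are all hypothetical.

```python
import numpy as np


def objective(X, mask, W, U, V, lam):
    """Masked reconstruction error plus the homophily (graph Laplacian) penalty."""
    L = np.diag(W.sum(axis=1)) - W           # graph Laplacian L = D - W
    fit = np.sum((mask * (X - U @ V.T)) ** 2)
    smooth = np.trace(U.T @ L @ U)           # small when similar users have similar rows of U
    return float(fit + lam * smooth)


def masked_graph_nmf(X, mask, W, rank=2, lam=0.1, iters=300, seed=0, eps=1e-9):
    """Illustrative sketch only (assumed objective, not the thesis's algorithm).

    X    : (n_users, n_items) non-negative data, missing entries may hold any value
    mask : same shape, 1.0 where X is observed, 0.0 where it is missing
    W    : (n_users, n_users) symmetric non-negative user-similarity matrix
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, rank)) + 0.1           # strictly positive initial factors
    V = rng.random((m, rank)) + 0.1
    D = np.diag(W.sum(axis=1))                # degree matrix of the similarity graph
    MX = mask * X                             # only observed entries drive the fit
    history = [objective(X, mask, W, U, V, lam)]
    for _ in range(iters):
        # Multiplicative updates keep U and V non-negative at every step.
        R = mask * (U @ V.T)
        U *= (MX @ V + lam * (W @ U)) / (R @ V + lam * (D @ U) + eps)
        R = mask * (U @ V.T)
        V *= (MX.T @ U) / (R.T @ U + eps)
        history.append(objective(X, mask, W, U, V, lam))
    return U, V, history


# Toy data (hypothetical): zeros mark missing ratings.
X = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)
mask = (X > 0).astype(float)
# Users 0 and 1 are assumed similar, as are users 2 and 3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

U, V, history = masked_graph_nmf(X, mask, W)
X_hat = U @ V.T                               # dense reconstruction, including missing cells
```

The missing cells of `X` are read off `X_hat` after fitting. Raising `lam` pulls the factor rows of users linked in `W` closer together, which is the effect the homophily assumption is meant to have. A full B-NMF would additionally need the block partitioning and coupling term described in the thesis itself.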
Keywords/Search Tags: data mining, matrix factorization, NMF, blocks-coupled, homophily regularization