
Study of High-Dimensional Data Modeling Based on Tensor Decompositions

Posted on: 2018-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Tang
Full Text: PDF
GTID: 2348330518995430
Subject: Information and Communication Engineering

Abstract/Summary:
With the rapid development of the Internet, hundreds of millions of pieces of information are generated every day, and this information forms a complex network of relationships. As the volume of information grows, the relationships among items become increasingly complicated. Expressing the relationships between people and people, people and information, and information and information requires a high-dimensional tensor model. Traditional feature-extraction methods, however, first convert high-dimensional tensor data into matrix form, which tends to destroy the structure of the data and lose information, and ultimately degrades results in practical applications. In recent years, many researchers have therefore focused on modeling tensors directly. Tensor low-rank approximation, that is, tensor decomposition, is the basis of tensor analysis, and many tensor-based data models depend on it.

In this thesis, we study the main algorithms for non-negative tensor factorization (NTF) and their probabilistic graphical models (PGMs), and apply NTF to practical problems such as personalized recommendation and image and video processing. We first survey related research on tensor decomposition. We then examine in detail non-negative tensor decomposition based on minimizing the Euclidean distance and on minimizing the KL divergence, under both the CP model and the Tucker model.

The main contributions and innovations are as follows:

1. We improve non-negative sparse tensor decomposition based on multiplicative updates. For a sparse tensor containing a large number of zero-valued elements, we reduce the space complexity by using a serialized tensor and reduce the time complexity by sorting the sparse serialized tensor. As a result, non-negative tensor decomposition reaches linear complexity in both space and time.

2. We propose a new algorithm, EMTC, based on non-negative tensor decomposition. EMTC is a tensor completion algorithm built on the Expectation-Maximization (EM) algorithm, and it addresses missing entries in tensor data such as color images.

3. We summarize the various non-negative tensor decomposition algorithms and give their probabilistic interpretations. On the one hand, the NTucker-KL and NCP-KL algorithms are equivalent to the tensor aspect model (TAM). On the other hand, the NTucker-EU and NCP-EU algorithms can be expressed by a Gaussian probability model, namely probabilistic tensor factorization (PTF). We also give the probabilistic graphical models of these algorithms.

Building on this research, our experiments show that non-negative tensor decomposition performs well in personalized recommendation and in image and video applications. Details are as follows:

1. We apply the tensor aspect model to personalized citation recommendation and solve it with the improved sparse non-negative tensor decomposition. We compare the non-negative tensor factorization methods with keyword counting, a keyword-based PLSA algorithm, and collaborative filtering based on authors' reading behavior. The NCP-KL and NTucker-KL algorithms obtain relatively good results. We also study the effect of the rank in the NCP-KL algorithm: once the rank reaches a certain size, the information in the original data set is exploited most fully.

2. We apply tensor decomposition to images and video. The EMTC algorithm is applied to image completion, and its results are compared with those of the CP-WOPT and LRTC algorithms. The results show that EMTC performs well in terms of data filling and stability. We also discuss the influence of different initial values on EMTC; to some extent, reducing the number of iterations in the M-step mitigates the effect of the initial values. For video, we represent the video directly as a tensor. After modeling the video with the NCP-KL and NCP-EU algorithms, we can distinguish the different scenes by analyzing the three factor curves in the factor matrix for the time axis.
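
As a rough illustration of the multiplicative-update idea behind sparse non-negative CP decomposition with Euclidean loss (NCP-EU), the following Python sketch may help. It is not the thesis's improved algorithm. It assumes a three-way tensor stored as a coordinate list of (index, value) pairs, which is only one possible reading of "serialized tensor", and it omits the sorting step entirely. The function names `gram`, `update_mode` and `squared_error` and the toy data are hypothetical.

```python
import random

def gram(M):
    """Return the R x R Gram matrix M^T M of a list-of-rows matrix."""
    R = len(M[0])
    return [[sum(row[a] * row[b] for row in M) for b in range(R)]
            for a in range(R)]

def update_mode(entries, factors, mode, eps=1e-12):
    """Apply one multiplicative update (Euclidean loss) to factors[mode]."""
    F = factors[mode]
    R = len(F[0])
    m1, m2 = [m for m in range(3) if m != mode]
    # Numerator: accumulate over stored non-zeros only (cost linear in nnz).
    num = [[0.0] * R for _ in F]
    for idx, val in entries:
        p, q = factors[m1][idx[m1]], factors[m2][idx[m2]]
        row = num[idx[mode]]
        for r in range(R):
            row[r] += val * p[r] * q[r]
    # Denominator: F @ (G1 * G2), where * is the element-wise product.
    G1, G2 = gram(factors[m1]), gram(factors[m2])
    for i, f in enumerate(F):
        den = [sum(f[s] * G1[s][r] * G2[s][r] for s in range(R))
               for r in range(R)]
        F[i] = [f[r] * num[i][r] / (den[r] + eps) for r in range(R)]

def squared_error(entries, factors):
    """Squared Frobenius error, computed without a dense tensor."""
    A, B, C = factors
    R = len(A[0])
    x_norm = sum(val * val for _, val in entries)
    inner = sum(val * sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
                for (i, j, k), val in entries)
    GA, GB, GC = gram(A), gram(B), gram(C)
    model_norm = sum(GA[r][s] * GB[r][s] * GC[r][s]
                     for r in range(R) for s in range(R))
    return x_norm - 2.0 * inner + model_norm

random.seed(0)
shape, rank = (4, 3, 5), 2
# Toy sparse tensor; indices and values are made up for illustration.
entries = [((0, 0, 0), 2.0), ((1, 2, 4), 1.0),
           ((3, 1, 2), 3.0), ((2, 0, 1), 0.5)]
# Strictly positive random initial factors, one matrix per mode.
factors = [[[random.random() + 0.1 for _ in range(rank)] for _ in range(n)]
           for n in shape]

before = squared_error(entries, factors)
for _ in range(50):
    for mode in range(3):
        update_mode(entries, factors, mode)
after = squared_error(entries, factors)
print(before, after)
```

In this sketch the numerator touches only the stored non-zero entries, and the denominator uses small rank-by-rank Gram matrices, so the dense tensor is never formed. Because every update multiplies a non-negative factor by a non-negative ratio, the factors stay non-negative without an explicit projection step.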
Keywords/Search Tags: tensor decomposition, probabilistic tensor model, personalized recommendation, image completion, scene segmentation