Research On Big Data Fusion Model Based On Tensor

Posted on: 2019-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: W F Qin
Full Text: PDF
GTID: 2348330545492107
Subject: Computer Science and Technology
Abstract/Summary:
Information technology is developing rapidly, the scale of data is growing exponentially, and the value of big data is attracting increasing attention. Two key problems in the field are how to represent big data in a unified model and how to reduce its dimensionality efficiently. At present, unstructured, semi-structured, and structured data each have their own representation model, and no single model represents all three kinds of data. In addition, redundant, inconsistent, and noisy data occur in large quantities during big data processing, which makes it difficult for algorithms to run efficiently and reduces the accuracy of their results. How to express the internal structure of big data in a unified and efficient mathematical model, and how to obtain a high-quality core data set from the original data set through a dimensionality reduction algorithm, are therefore of great significance to big data research. As big data has developed, the application of tensors to big data has received wide attention. Big data is characterized by large volume, high velocity, variety, and low value density. By analyzing these characteristics, this thesis studies a tensor-based big data fusion model and methods for big data reduction. The main contents are as follows.

First, a big data fusion model based on the semi-tensor product is proposed. Structured, semi-structured, and unstructured data each have a separate representation model, and traditional data models cannot integrate them into a single model. Considering the different structure types and characteristics of the three kinds of data, this thesis proposes a unified big data fusion model based on the semi-tensor product that represents structured, semi-structured, and unstructured data in the same model. For the different cases that arise for the three kinds of array data during fusion, the semi-tensor product is used to fuse them. The model not only integrates multi-source heterogeneous data into a unified tensor model but also keeps the basic characteristics of the original data unchanged.

Second, a big data reduction method is proposed. During data processing, excessive data volume and overly complex intermediate results directly lead to low efficiency in big data dimensionality reduction. To solve this problem, the fusion model of multi-source heterogeneous data is first segmented, dividing a large tensor model into several small tensor models, and each small tensor model is then reduced incrementally. That is, each tensor obtained by segmentation is unfolded, and the unfolded matrix is then projected onto the vector basis space of the original matrix, finally yielding a core tensor that can replace the original tensor. Experimental results show that the algorithm has low time complexity and that the core data set greatly improves the quality of big data.

Finally, an improved data reduction algorithm based on non-negative matrix factorization is proposed. In non-negative matrix factorization the factors obtained are non-negative and the dimension of the data is reduced at the same time, so the core tensor can be reduced further. The core tensor is first unfolded and each unfolded matrix is factorized; the factorized matrices are then merged to obtain a core tensor with less redundancy and higher data quality. Applying non-negative matrix factorization to the tensor model reduces the data dimension of the model while keeping the data distribution of the sample set consistent with that of the original data set.
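
The abstract does not give formulas, so the following is only a minimal sketch of two operations it names, written under stated assumptions. For the fusion step it assumes the standard left semi-tensor product of an m x n matrix A and a p x q matrix B, (A ⊗ I_{t/n})(B ⊗ I_{t/p}) with t = lcm(n, p), which reduces to the ordinary matrix product when n = p. For the reduction step it assumes that "unfold, then project onto the basis space" means a mode-wise projection onto the leading left singular vectors of each unfolding (an HOSVD-style reduction); the thesis may do this differently. The function names `semi_tensor_product`, `unfold`, and `project_to_core` are hypothetical and chosen for illustration.

```python
import math

import numpy as np


def semi_tensor_product(a, b):
    """Left semi-tensor product of a (m x n) and b (p x q).

    Assumes the standard definition with t = lcm(n, p):
    (a kron I_{t/n}) @ (b kron I_{t/p}).
    """
    n = a.shape[1]
    p = b.shape[0]
    t = n * p // math.gcd(n, p)  # least common multiple of n and p
    left = np.kron(a, np.eye(t // n))   # shape (m * t/n, t)
    right = np.kron(b, np.eye(t // p))  # shape (t, q * t/p)
    return left @ right


def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def project_to_core(tensor, ranks):
    """Illustrative reduction (an assumption, not the thesis's algorithm).

    For each mode, take the leading left singular vectors of the unfolding
    as a basis and project the tensor onto it. Returns the reduced core
    tensor and the list of basis matrices.
    """
    core = tensor
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        u = u[:, :rank]
        factors.append(u)
        # Mode product with u.T: contract u's rows with the tensor's mode axis.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors


if __name__ == "__main__":
    a = np.array([[1.0, 2.0, 3.0, 4.0]])  # 1 x 4
    b = np.array([[1.0], [2.0]])          # 2 x 1, inner sizes differ
    print(semi_tensor_product(a, b))      # [[ 7. 10.]]

    data = np.random.default_rng(0).standard_normal((4, 5, 6))
    core, _ = project_to_core(data, (2, 3, 4))
    print(core.shape)                     # (2, 3, 4)
```

The semi-tensor product is useful here because it is defined even when the inner dimensions of the two matrices do not match, which is the situation when arrays of different shapes have to be combined.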
Keywords/Search Tags: big data, tensor model, dimensionality reduction, MapReduce