Multi-feature Representation Learning And Its Application To Multimedia Data Prediction

Posted on:2018-10-30

Degree:Doctor

Type:Dissertation

Country:China

Candidate:P G Jing

Full Text:PDF

GTID:1368330596497234

Subject:Signal and Information Processing

Abstract/Summary:


The rapid development of the mobile Internet and of data acquisition hardware is producing an explosive growth of multimedia data. Multimedia data have become rich, diverse, and complex, and they are no longer limited to a single medium but increasingly take cross-media form. With the advent of the era of Artificial Intelligence 2.0, multimedia analysis has gradually shifted toward demand-driven intelligent analysis, and multimedia prediction plays an important role in many real-world applications. Based on an in-depth analysis of current domestic and international research, this thesis takes high-order time series, single-media data, and cross-media data as its research subjects and conducts an in-depth study of multimedia feature representation. The main contents and innovations of this thesis are summarized as follows:

1. We propose two novel models, Multilinear Orthogonal Autoregressive (MOAR) and Multilinear Constrained Autoregressive (MCAR), for high-order time series prediction. MOAR is designed to preserve as much information from the original data as possible under an orthogonal constraint. MCAR is an enhanced method developed by adding an inverse decomposition error term as the constraint. In both models, we project the original tensor into subspaces spanned by basis matrices to discover its intrinsic temporal structure. To better preserve the temporal smoothness between consecutive slices of the tensor, the projection matrices are learned jointly with an autoregressive (AR) model.

2. We propose a joint low-rank and sparse regression (JLRSR) framework for image memorability prediction. JLRSR jointly learns two things: a low-rank projection matrix that decomposes the original data into a component part and an error part, and a regression coefficient vector for image memorability prediction. The projection matrix and the regression coefficients are bound by a sparse constraint to make the approach sufficiently invariant to the training samples. Moreover, a graph regularization term is constructed to improve generalization performance and prevent overfitting.

3. We propose a novel framework called Multi-view Transfer Learning from External Sources (MTLES) to predict image memorability. In this framework, we simultaneously leverage different types of visual feature sets and multiple types of predefined image attributes derived from external sources. Specifically, to enhance the representation ability of the visual features, we construct connections between the visual feature sets and high-level image attributes by transferring attribute knowledge from external sources. MTLES integrates weak learning through external sources, transfer learning, and a multi-view consistency loss over the different types of feature sets into a joint framework. To solve this joint optimization problem, we further develop an alternating iterative algorithm.

4. We address popularity prediction of micro-videos by presenting a novel low-rank multi-view embedding learning framework, named transductive low-rank multi-view regression (TLRMVR). It boosts the performance of micro-video popularity prediction by jointly considering the intrinsic representations of the source and target samples. In particular, TLRMVR integrates low-rank multi-view embedding and regression analysis into a unified framework, such that the lowest-rank representation shared by all views not only captures the global structure of all views but also reflects the regression requirements. The framework is formulated as a regression model that seeks a set of view-specific projection matrices with low-rank constraints to map multi-view features into a common subspace. In addition, a multi-graph regularization term is constructed to improve the generalization capability and further enhance the robustness of the proposed algorithm.
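The general idea behind the first contribution (project each time slice of a tensor into a subspace spanned by orthonormal basis matrices, model the projected slices with an AR model, then map the forecast back) can be illustrated with a minimal NumPy sketch. This is not the MOAR or MCAR algorithm from the thesis: here the bases are taken from a plain SVD of the mode unfoldings instead of being learned jointly with the AR model, the AR coefficients are scalars shared across all core entries, and all function names (orthonormal_bases, fit_ar, forecast_next) are hypothetical.

```python
import numpy as np

def orthonormal_bases(X, r1, r2):
    """Return orthonormal bases U (I x r1) and V (J x r2) for X of shape (T, I, J).

    Assumption: bases come from the SVD of the mode unfoldings (HOSVD-style),
    not from the joint optimization described in the abstract.
    """
    T, I, J = X.shape
    mode1 = X.transpose(1, 0, 2).reshape(I, T * J)  # rows indexed by i
    mode2 = X.transpose(2, 0, 1).reshape(J, T * I)  # rows indexed by j
    U = np.linalg.svd(mode1, full_matrices=False)[0][:, :r1]
    V = np.linalg.svd(mode2, full_matrices=False)[0][:, :r2]
    return U, V

def fit_ar(cores, p):
    """Least-squares fit of scalar AR(p) coefficients shared by all core entries."""
    T = cores.shape[0]
    flat = cores.reshape(T, -1)
    # Column k holds the lag-(k+1) values aligned with the targets flat[p:].
    A = np.stack([flat[p - k - 1:T - k - 1].ravel() for k in range(p)], axis=1)
    b = flat[p:].ravel()
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

def forecast_next(X, r1, r2, p):
    """Forecast the next (I x J) slice of the series X of shape (T, I, J)."""
    U, V = orthonormal_bases(X, r1, r2)
    # Core slices G_t = U^T X_t V live in the low-dimensional subspace.
    cores = np.einsum("ia,tij,jb->tab", U, X, V)
    coef = fit_ar(cores, p)
    next_core = sum(coef[k] * cores[-1 - k] for k in range(p))
    return U @ next_core @ V.T  # map the forecast back to the original space

# Synthetic example: low-rank slices that decay by a factor 0.9 per step.
rng = np.random.default_rng(0)
U0 = np.linalg.qr(rng.standard_normal((6, 2)))[0]
V0 = np.linalg.qr(rng.standard_normal((5, 2)))[0]
G0 = rng.standard_normal((2, 2))
X = np.stack([U0 @ (0.9 ** t * G0) @ V0.T for t in range(20)])
pred = forecast_next(X, r1=2, r2=2, p=1)
```

On this synthetic series the slices lie exactly in a rank-2 subspace and follow an exact AR(1) recursion, so the forecast should coincide with 0.9 times the last observed slice up to floating-point error. Real data would need the ranks and AR order chosen by validation.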

Keywords/Search Tags:

Multimedia data prediction, Feature representation, Time series, Tensor decomposition, Image memorability, Micro-video popularity


Related items

1	Research On Learning Low-rank-sparse Feature Representation For Image Memorability Prediction
2	Research On Time Series Data Analysis And Network Compression Based On Tensor Calculation
3	Study Of Tunnel Sensor Data Prediction Based On Time Series Analysis
4	Research On Image Memorability Prediction Method Based On Low-rank Representation Learning
5	Research On Tensor-based Video Watermarking Algorithms
6	Research On Dimensionality Reduction And Prediction Methods In Time Series Data Mining
7	Tensor Representation And Semantic Modeling For Image Annotation
8	Research On Video Memorability Prediction Based On Multimodal Feature Fusion
9	Research And Application Of Multistep Prediction Method For Time Series Data Based On RNN
10	Research On Image Denoising Algorithm Based On Statistical Analysis And Tensor Decomposition Model