
Research On Image Feature Representations Based On Deep Neural Network

Posted on: 2019-04-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 1368330611993005
Subject: Control Science and Engineering
Abstract/Summary:
Image feature representation is an important topic in computer vision. It is widely used in fields such as visual data retrieval, autonomous driving, and video copy detection. With the arrival of the big data era, the scale of visual data is exploding, and there is an urgent need for image feature representation methods that support a comprehensive, fine-grained representation and understanding of massive visual data across different applications. This demand has driven the transformation of traditional machine learning methods and the rise of deep learning. Based on the hierarchical representation of visual data and the deep neural network framework, this dissertation proposes several approaches that address shortcomings of image feature representation in different applications. The problem is decomposed into three subtasks: feature representation for image patches, robust feature representation for images, and feature representation for image sequences. The main contributions are as follows:

(1) A deep feature representation for image patches based on hierarchical multi-scale feature integration is proposed. Most deep descriptors are built on a deep network framework, but because of the constraints of the CNN structure, the learned descriptors do not handle scale variations well. Inspired by recent work on the multi-scale problem in skeleton detection and object detection, and exploiting the fact that feature maps from different CNN layers have receptive fields of different sizes, we propose a deep descriptor that integrates feature maps of different scales to obtain strong robustness to scale changes. First, a CNN is employed to encode the image patches, and the matching relationships between image patches are provided by a batch-based training technique. Finally, a robust deep descriptor with strong discriminability is learned. In the experiments, the proposed model is evaluated on datasets for several tasks, including image patch matching, image retrieval, and wide-baseline stereo. The results show that the model outperforms state-of-the-art techniques on multiple evaluation metrics.

(2) A local binary rotation-invariant image feature representation (RI-LBCNN) is proposed and applied to image classification. Convolutional neural networks have achieved unprecedented success in computer vision, but handling orientation transformations of objects effectively with few parameters remains a challenge. We propose a new convolutional module, the Local Binary orientation Module (LBoM), which combines the advantages of Local Binary Convolution and Active Rotating Filters to deal with rotation variations using fewer parameters. LBoM can be inserted directly into popular CNN models, upgrading them to RI-LBCNNs. RI-LBCNNs can be trained end to end with off-the-shelf optimization approaches and used for image classification. Extensive experiments on four benchmarks show that RI-LBCNNs perform image classification with fewer network parameters and significantly outperform the baseline LBCNN on images with large rotation variations.

(3) A deep CNN-based feature representation for image sequences is proposed, together with a video copy detection method based on graph-based sequence matching. Most traditional video copy detection algorithms use hand-crafted features, which rely heavily on the domain knowledge and experience of the feature designers. We propose a content-based video copy detection algorithm built on a deep CNN-based image sequence feature representation. First, the deep CNN-based representation encodes the visual content of the video keyframes, preserving frame-level discrimination capability. By computing the Euclidean distance between the deep CNN features of different frames, the similarity between frames is obtained. In addition, a keyframe-based copy retrieval method is proposed that retrieves candidate copy keyframes from a large keyframe database without building a keyframe index. Finally, based on the temporal consistency constraint of video data, a graph-based sequence matching algorithm is employed to obtain the copied video segments and locate them accurately. The experimental results show that the proposed deep CNN features have strong discriminative ability, and they also verify the validity of the proposed copy video retrieval algorithm.
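
The frame-similarity and index-free keyframe retrieval steps described in part (3) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function names, the toy 2-D feature vectors (stand-ins for real CNN keyframe features), and the `max_dist` threshold are all assumptions chosen for illustration. It only shows how Euclidean distance between feature vectors can serve as a frame similarity measure and how candidates can be found by a linear scan over the database.

```python
import numpy as np

def pairwise_euclidean(query_feats, db_feats):
    """Return an (n_query, n_db) matrix of Euclidean distances between feature rows."""
    # Broadcasting builds all query/database difference vectors at once.
    diff = query_feats[:, None, :] - db_feats[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def retrieve_candidates(query_feats, db_feats, max_dist):
    """Linear scan (no index): for each query keyframe, list the database
    keyframes whose feature distance is at most max_dist (hypothetical threshold)."""
    dists = pairwise_euclidean(query_feats, db_feats)
    return [np.flatnonzero(row <= max_dist).tolist() for row in dists]

# Toy stand-ins for CNN keyframe features (hypothetical values).
db = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
query = np.array([[0.1, 0.0], [0.0, 2.9]])

candidates = retrieve_candidates(query, db, max_dist=0.5)
print(candidates)  # [[0], [2]]
```

The sketch stops at candidate retrieval. The graph-based sequence matching that enforces temporal consistency across consecutive keyframes would operate on these candidate lists and is not reproduced here, since the abstract does not specify its details.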
Keywords/Search Tags: image feature representation, deep neural network, CNN, scale invariant, rotation invariant