Font Size: a A A

Mining Semantics Across Images

Posted on:2018-12-07Degree:DoctorType:Dissertation
Country:ChinaCandidate:Z H YuanFull Text:PDF
GTID:1318330542468410Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With the development of Internet and the descending cost of data traffic,people prone to interact on Internet using multimedia data instead of text and the amount of vi-sual data on Internet rises greatly.Abundant data make image understanding and video analysis a pressing need.Recently,researchers have gained much progress on several computer vision problems such as object recognition,scene classification,object detec-tion and image segmentation,which make computer capable of understanding a single image.Different from image semantics,semantics across images are information that can only be expressed by multiple images instead of a single image,e.g.,common cat-egories across images.Compared to single image understanding,mining semantics across images is much harder.There still needs a long way to go for this task because the performance of current systems is unsatisfactory and our research aims to improve both effectiveness and efficiency of it.In this thesis,the author will firstly attempt to discover common categories across images and co-segment them quickly.Then a deep learning technique is introduced to co-segment foreground objects more accurately.Finally,the author will direct his eyes to consecutive frames,i.e.,video,and further discover actions in videos.The main contributions of this thesis are fourfold:1.A generative probabilistic model is proposed to improve the effectiveness of cat-egory discovery.The model can generate appearance of pixels,regions and the entire image in a unified way by assigning a latent topic variable for each of them.Based on the observation that pixels and regions of the same category share similar appearance,category discovery turns to be the inference of latent variables after regarding latent topics as categories.In addition,spatial context priors are intro-duced to model relations between categories and regularise final inference.The entire model is solved under a novel Gibbs EM framework.2.A visual relation network is introduced to improve the efficiency of category dis-covery and image co-segmentation.This method generates several segments for images firstly and then a visual relation network is constructed to link all these seg-ments.In this way,image co-segmentation falls into selecting correct segments for every image and it can be solved by voting on the constructed VRN.This thesis proposes a topic-level random walk algorithm to perform voting and select those segments with high scores.3.A deep-dense conditional random field is proposed to perform object co-segmentation.The method combines deep neural network and conditional random field under a single framework to encode shared information across images.A cooccurrence map is introduced to summarize discovered information for each image and used as extra prior for single image segmentation.4.A structured sum is proposed to represent action in videos and perform action de-tection.Compared to unordered images,consecutive frames reflect more semantics such as action.The thesis addresses action detection in long videos by modelling actions as structured sums.Then the problem is converted to K-largest structured maximum sums and a dynamic program algorithm is introduced to solve it in linear time complexity.The framework is trained in an end-to-end way and gets good per-formance by making full use of powerful representability of deep neural network.In a word,this thesis addresses the problem of mining semantics across images.The task varies from discovering common information in unordered images to understand-ing actions encoded in consecutive frames.A series of novel methods are proposed to improve either effectiveness or efficiency by overcoming several challenges.e.g.,adding context priors to regularise category discovery and making use of deep learning techniques to improve the accuracy of co-segmentation.Experiments indicate that the proposed methods are effective on standard benchmarks.
Keywords/Search Tags:Semantics across Images, Image Co-segmentation, Category Discovery, Video Analysis, Action Detection
PDF Full Text Request
Related items