Learning Conditional Models for Visual Perception

Posted on:2019-07-23

Degree:Ph.D

Type:Dissertation

University:Cornell University

Candidate:Veit, Andreas

GTID:1478390017489126

Subject:Computer Science

Abstract/Summary:

In recent years, the field of computer vision has seen a series of major advances, made possible by rapid development in algorithms, data collection and computing infrastructure. As a result, vision systems have started to be broadly adopted in everyday applications. Progress has been particularly promising in image recognition, where algorithms now often match human performance. Nevertheless, vision systems still largely lag behind humans in their ability to understand the complexities of the visual world and its apparent contradictions. For example, an image can carry different meanings for different people in different contexts. However, because they are often limited to a single point of view, vision systems tend to focus on the meaning that dominates in the training data.

In this dissertation, we address this limitation by building conditional vision models that can learn from multiple points of view and adapt their results to account for different conditions. First, we address the related tasks of image tagging and tag-based image retrieval. In particular, we build a system that can take into account the fact that people may associate different meanings with certain images and tags. Thus, the system can personalize outputs for ambiguous tags such as #rock, which could refer to a music genre, a geological object or even outdoor climbing. Further, we focus on the task of image-based similarity search. Specifically, we design a system that can understand multiple notions of similarity. For example, when searching for items related to an input image of a shoe, users might be interested in shoes of similar color, of similar style, or for the same kind of activity. By capturing the multitude of aspects in terms of which objects can be compared, our system can find the right set of related items. Lastly, we explore how the underlying convolutional networks themselves can be made aware of the context in which they are used. In a study, we first develop a new understanding of the roles that individual layers take on in modern convolutional networks. Then, we leverage our insights and design a network that can adaptively define its own topology conditioned on the input image to increase both accuracy and efficiency.

Keywords/Search Tags:

Image, Vision

Related items

1	Design And Implementation Of Image Measurement System Based On Machine Vision
2	Research On The Identification And Detection Of Parts Based On Machine Vision
3	Research On Learning-Based Low-Level Vision Problem
4	Research On 3D Reconstruction Technology Based On Stereoscopic Vision
5	Research On Modified Algorithm Of Soccer-Robot Vision Subsystem Of MiroSot
6	Research Of The Detection System Of The Pulse Image Information Based On Binocular Vision
7	Computer Vision Theory And Applications Of Three-dimensional Reconstruction
8	Measurement Of Cow Breast Volume Based On The Analysis In Ultrasound Image And Binocular Vision
9	Based On Machine Vision, Image Recognition System Of Drosophila Compound Eye Disease Study
10	Key Technology Research Of Vision Sensor Based On DSP