
Research On Limited Supervised Learning In Computer Vision

Posted on: 2020-01-26 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: S Qiu | Full Text: PDF
GTID: 1368330620458553 | Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, recognition tasks in computer vision, including image classification, object localization, and semantic segmentation, have achieved promising results with the development of supervised machine learning algorithms. However, to be used in practical tasks, such systems need a large amount of accurately labeled training data to ensure good performance and robustness, and obtaining such data requires considerable time and labor. Exploring machine learning methods with limited supervision, that is, with only a small number of annotations, can therefore help reduce the time and labor required to obtain precisely annotated samples. This dissertation focuses on limited-supervision machine learning methods for three recognition tasks in computer vision. Concretely, it proposes a new semi-supervised classification algorithm, a weakly supervised object localization algorithm, and a few-shot semantic segmentation algorithm. Comprehensive experiments verify the proposed methods. The main contributions are as follows:

1. Fast flexible manifold embedding algorithms for graph-based semi-supervised classification. The first problem addressed is large-scale graph-based semi-supervised learning for multi-class classification. Most existing scalable graph-based semi-supervised learning methods either cannot cope with unseen samples or rely on a hard linear constraint, which limits their applicability and learning performance. To this end, we build upon the earlier flexible manifold embedding (FME) [1] and propose two novel linear-complexity algorithms, fast flexible manifold embedding (f-FME) and reduced flexible manifold embedding (r-FME). Both methods accelerate FME and inherit its advantages. Specifically, they relax the hard linear constraint by jointly optimizing a regression residue term and a manifold smoothness term, which naturally yields a prediction model for handling unseen samples. To reduce computational cost, the graph adjacency matrix is constructed from the relationship between a small number of anchor points and all data points, which leads to simplified closed-form solutions. The resulting f-FME and r-FME algorithms scale linearly in both time and space with the number of training samples, and they effectively use information from both labeled and unlabeled data. Experimental results show the effectiveness and scalability of the proposed methods.

2. A weakly supervised pixel-level object localization method based on global weighted average pooling within the fully convolutional network framework. The second problem addressed is simultaneous pixel-level localization and image-level classification when only image-level labels are available for training a fully convolutional network. Previous work used global max pooling and global average pooling. Because these two operations are hard-coded and not learnable, they have difficulty indicating the precise regions of target objects during weakly supervised learning. This dissertation therefore focuses on the global pooling step, which plays a vital role in weakly supervised pixel-level object localization, and explores global weighted average pooling (GWAP) for this task. A class-agnostic GWAP module and a class-specific GWAP module are proposed. The classification and pixel-level localization ability of the proposed method is evaluated on the ILSVRC benchmark dataset, and the results show that the GWAP module better captures the regions of foreground objects. In addition, knowledge transfer between the weakly supervised image classification task and the region-based object detection task is explored. A multi-task framework that combines the proposed class-specific GWAP module with R-FCN is proposed. This framework is trained with a small number of ground-truth bounding boxes and a large number of image-level labels, and it is evaluated on the PASCAL VOC dataset. Experimental results show that the framework can use data with only image-level labels to improve the generalization of the object detection model.

3. The extrinsic intrinsic correlation network with context information for few-shot semantic segmentation. Because collecting precisely labeled pixel-level samples is costly, semantic segmentation with few samples has attracted widespread attention in recent years. Since fine-tuning a pre-trained segmentation network on a few labeled images is prone to overfitting, two-branch networks were proposed, in which the support image branch guides the segmentation of the query image branch. However, previous works considered only the feature similarity between the support image and the query image and did not make full use of the self-similarity of the query image. The third problem addressed is how to make better use of the query image information and integrate it with the support image information to improve few-shot semantic segmentation. A novel extrinsic intrinsic correlation network (EICNet) is proposed, which combines information from both support and query images. Two additional strategies are also proposed to further improve performance: incorporating a global context feature and using two scaled query inputs. Extensive experiments are conducted on the benchmark dataset Pascal VOC 2012-5. Compared with the baseline network, the full version of EICNet improves performance by 3.6% in the one-shot setting. Controlled experiments demonstrate the effectiveness of each proposed design. The experiments also show that the intrinsic correlation of the query image provides supplementary information that benefits few-shot semantic segmentation, and that the proposed network uses this information effectively.
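
To make the pooling idea in contribution 2 concrete, the following is a minimal NumPy sketch of a weighted alternative to global average pooling. It is an illustration under stated assumptions, not the dissertation's implementation: the abstract does not give the GWAP formulation, so the scoring projection `proj`, the bias `bias`, the spatial softmax normalization, and both function names are hypothetical choices made for this example.

```python
import numpy as np


def global_average_pool(features):
    """Plain global average pooling: every location gets weight 1 / (H * W)."""
    return features.mean(axis=(1, 2))


def global_weighted_average_pool(features, proj, bias=0.0):
    """Weighted average over spatial locations (illustrative sketch only).

    features: array of shape (C, H, W), a convolutional feature map.
    proj:     array of shape (C,), a hypothetical learnable 1x1 projection
              that scores each spatial location (an assumption, see lead-in).
    Returns the pooled vector of shape (C,) and the weight map of shape (H, W).
    """
    # Score every location with the 1x1 projection: result has shape (H, W).
    scores = np.tensordot(proj, features, axes=([0], [0])) + bias
    # Spatial softmax (assumed normalization) so the weights sum to 1.
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()
    # Weighted average of the features over all spatial locations.
    pooled = (features * weights[None, :, :]).sum(axis=(1, 2))
    return pooled, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 3, 3))
    pooled, weight_map = global_weighted_average_pool(feats, rng.normal(size=4))
    print(pooled.shape, weight_map.shape)
```

With a zero projection the weight map is uniform and the result reduces to ordinary global average pooling; a trained projection can instead concentrate weight on some locations, and the resulting weight map can be read as a coarse localization map. This is the intuition behind replacing a fixed pooling rule with a learnable one.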
Keywords/Search Tags:Computer vision, Graph-based semi-supervised learning, Weakly supervised object localization, Few-shot semantic segmentation