Font Size: a A A

Research Of Object Detection And Classification Algorithms Based On Deep Visual Representation

Posted on:2022-02-25Degree:MasterType:Thesis
Country:ChinaCandidate:J S Z MoFull Text:PDF
GTID:2518306530973219Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Recent years,benefiting from the related research on deep learning,computer vision filed has achieved explosive and rapid development.As the major approach of the deep visual representation,deep convolutional network boosts the improvements of vision tasks including image classification and object detection.Meanwhile,in complex scenarios,deep learning-based vision model faces with more and more challenging application circumstances.Object detection task requires a model to locate objects in an image by calculating the precise bounding boxes and predicting object categories.The network model based on deep visual representation can complete object detection task by predicting bounding boxes and categorical probabilities.To gain such capability of detection,large scale manual welllabeled data is required and it costs a long time on model training.A weakly-supervised model is hopeful to alleviate this problem.In that,a weakly-supervised detection model turns to facilitate deep visual representation for object detection by using the coarse-grained labels for training the networks instead of using well-labeled datasets,which reduces the requirement of the dataset quality.Fine-grained visual classification task requires a model to mine object detail features from an image for specifying the object a sub-category.One of the popular method is based on localization-then-classification pipeline,which combines the designing experience of detection(localization)and classification models.Currently,fine-grained classification model has some common problems.Main object localization relies on independent detector so that it's hard to train it as an end-to-end model.The key part estimation of objects need to train with extra annotations or set manually,which it's not flexible in some complex scenarios.It's hard for the model to adapt image contexts accordingly causing imbalanced local and global fine-grained features.For the further improvement of object detection and classification in complex scenarios,this thesis mainly researches deep convolutional networks,carries out the experiments on the detection and classification tasks for proposing two solutions.The main contributions of this thesis are concluded as following.1.With the conclusion of the experimental experience,this thesis discusses the key factors that making sense of visual representation quality.These discussions include the concepts from basic operation to integrated network scale,involving convolutional operation,feature activation and model scale.The network constructing and module designing in the subsequent chapters follow the basic rules in this part.2.To attempt to solve the problem of the problem relying on the highly-consuming well-labeled dataset,this thesis proposes ws Det,a weakly-supervised detector model only training with the image categorical labels.Though the exploration of the feature response effect,a self-adaptive bounding box generation module has been designed and a solution for multiscale object-feature matching problem has been discussed.For proving the availability of proposed ws Det,the related experiment has performed on the custom dataset BCA-3.Compared with the supervised detector models,ws Det costs much less time to be trained and achieves a favorable detection performance.3.For improving the visual representation quality of fine-grained classification model,the thesis proposes Global Perception Attention Network(GPANet).GPANet is a weaklysupervised end-to-end fine-grained visual classification model using multiscale feature fusion.There is no need to train this model with the extra key region annotations and to set manually the number of key parts for it.GPANet balances the quality of the highly-semantics and middle-detail representation well.It achieves the state-of-the-art level accuracy in the public benchmark datasets and proves its good generalization ability of the multiple scenarios.By visualizing its feature layer,it is clear to learn how GPANet effectively improve the quality of fine-grained feature representation.
Keywords/Search Tags:Deep learning, Convolutional neural network, Object detection, Fine-grained image classification
PDF Full Text Request
Related items