Font Size: a A A

Visual Detection And Classification Based On Deep Learning

Posted on:2018-12-05Degree:MasterType:Thesis
Country:ChinaCandidate:R ZhuFull Text:PDF
GTID:2428330512997259Subject:Computer technology
Abstract/Summary:PDF Full Text Request
With the development of the multimedia and Internet technology,image semantic understanding plays an increasingly important role in people's daily life.The applica-tion of visual detection and classification based on computer vision becomes more and more extensive.Though the research of visual detection and classification has achieved considerable progress,challenges still remain due to the complexity of the image.Re-cently,more and more researchers recognize that the deep learning model has a great advantage in dealing with the image semantic understanding.In this paper,we start from the based structure of the deep learning model including the data layer,the loss function layer and the transfer of the trained model.The contributions of this paper can be summarized as follows:Firstly,we focus on the problem of the input layer.As we all know,the input of the deep learning model need to crop or resize when the images have different sizes.We propose a novel descriptor,SPP-net,extracted by equipping the Convolutional Neural Network(CNN)with spatial pyramid pooling.We first compute the feature maps from the original images without any cropping or warping,and then generate the fixed-size representations for original images.Finally,we apply this model for text detection tasks.Secondly,we introduce a new method called deep metric learning especially for binary problem.We use the triplet loss to replace the traditional loss function(Softmax)and learn a mapping from image regions to a compact Euclidean space where distances correspond to a measure of text similarity.By combining the CNN model with metric learning,we can make reliable binary classification between different samples.Finally,we consider the transfer of the trained model.We propose the hierarchical network model combined with migration learning to solve this problem.In detail,we make use of the open models,taking the model's weights as the initial values.And in the training of the new model,we determine the finetue levels according to the original model and the similarity between the new model.At last,we give a practical strategy for learning rate setting.The effect is significant in the standard datasets.
Keywords/Search Tags:Deep Learning, Spatial Pyramid Pooling, Metric Learning, Transfer Learning
PDF Full Text Request
Related items