
Research And Application Of Binary Descriptor Based On Deep Learning

Posted on: 2019-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: F Y Yang
Full Text: PDF
GTID: 2438330545456860
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, the number of images has increased dramatically. Text-based image retrieval consumes a large amount of resources and cannot accurately represent image features, so it no longer meets users' needs. Deep learning, which uses multi-layer neural networks to accomplish a variety of tasks, can effectively extract image features. However, because the dimension of the network output is very high, the time needed for feature matching increases significantly. An image descriptor is therefore needed that both extracts image features accurately and supports efficient matching. To address these problems, this thesis constructs a binary image descriptor with the deep convolutional neural network model VGGNet16, in order to describe image features and reduce the feature matching time in image retrieval. The main work and contributions of this thesis are as follows:

(1) A pre-trained model is used to cope with the complexity of the deep convolutional neural network model, combined with GPU parallel computing to reduce the network training time. In the pre-training process, to extract image features better, two fully connected layers are added at the fully connected layer, and the network is then trained.

(2) The VGGNet16 classification layer is replaced with a new fully connected layer, and the output of this layer is binarized (i.e., represented by 0 and 1) to obtain the corresponding binary descriptor string, thereby reducing the matching time caused by the high dimensionality of the network's top-layer output. The binary strings output by this network are 12, 24, and 32 bits long, respectively. Computing the Hamming distance between binary codes then speeds up image retrieval.

(3) To better serve the image retrieval task, a corresponding loss function is designed. The total loss function is optimized in two parts:

a. The quantization loss generated in the binarization process. Reducing this loss makes the error between the high-dimensional real values before quantization and the quantized binary values smaller, which helps preserve the similarity of image features before and after quantization. First, the sign function is used to convert the high-dimensional real-valued output of the network's output layer into binary values that describe the image; then, following the squared-error formula, the function used in the binarization process is reconstructed to reduce the quantization loss.

b. The rotation error generated at different rotation angles. Reducing the error before and after rotation helps maintain the rotation invariance of the image. Because the corresponding loss error may grow as the rotation angle increases, a penalty factor is added to discriminate the category of the sample, and the error before and after rotation is then optimized according to the principle of error reduction, so as to obtain image information that satisfies the user.

Experiments show that the binary descriptors extracted by this method describe image features more accurately and improve retrieval accuracy. Using a pre-trained model effectively reduces the training time, saves computer memory, and overcomes the lack of a large data set; computing the similarity between images with binary values reduces the amount of computation in matching; and the optimized loss function makes image retrieval more effective.
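The binarization, quantization loss, and Hamming-distance matching described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the threshold at 0, the use of {-1, +1} sign codes inside the loss, and the toy 4-bit codes are all assumptions made for illustration, and the network itself is left out (the activations stand in for the output of the new fully connected layer).

```python
def sign_binarize(activations):
    """Threshold real-valued activations at 0, giving one bit (0 or 1) per dimension.
    Assumption: the layer output is roughly centred on zero."""
    return [1 if a > 0 else 0 for a in activations]


def quantization_loss(activations):
    """Mean squared error between activations and their {-1, +1} sign codes.
    Assumption: this is one common squared-error form, not necessarily the thesis's."""
    signs = [1.0 if a > 0 else -1.0 for a in activations]
    return sum((a - s) ** 2 for a, s in zip(activations, signs)) / len(activations)


def pack_bits(bits):
    """Pack a list of 0/1 bits into one integer so codes can be compared with XOR."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two packed codes differ."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database_codes, top_k=3):
    """Return indices of the top_k database codes nearest to the query.
    sorted() is stable, so ties keep their original database order."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical 4-dimensional activations; the thesis uses 12, 24, or 32 bits.
    query = pack_bits(sign_binarize([0.9, -0.2, 0.1, -0.7]))
    database = [0b0101, 0b1010, 0b1011]
    print(retrieve(query, database))
    print(quantization_loss([0.5, -0.5]))
```

Packing each code into one integer makes a comparison a single XOR followed by a bit count, which is why matching short binary codes is much cheaper than comparing high-dimensional real-valued vectors.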
Keywords/Search Tags:Deep learning, Image retrieval, VGGNet16, GPU parallel computing, Binary descriptor