Research And Implementation Of Product Image Classification Based On Deep Learning

Posted on: 2021-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: D Shi
Full Text: PDF
GTID: 2428330626955814
Subject: Engineering
Abstract/Summary:
With the development of Internet and mobile communication technology, the number of product pictures on the Internet is growing at a massive rate. Engineering practice shows that many product pictures do not share the characteristics of common objects, so image classification with a CNN model alone achieves low accuracy. However, the text in these images carries rich information, which can be extracted with scene text recognition technology. This thesis takes product image classification based on deep learning as its key research object, taking the characteristics of the engineering setting into consideration. The main research work is as follows:

(1) This thesis first studies image classification technology. After analyzing bad cases, it proposes a technical route of multimodal learning that uses the text information in images. Convolutional neural networks based on VGG16, Inception, and ResNet50 are fine-tuned with transfer learning, which provides the image classification component of the multimodal network.

(2) This thesis studies text extraction and recognition from product pictures, which provides the text-modality data for the multimodal network. The text boxes in a product picture are extracted with the text detection model EAST. This thesis also implements the text recognition network CRNN.

(3) This thesis studies text classification algorithms, which provide the text classification component of the multimodal network. Two text classification models, TextCNN and BERT, are implemented. The BERT language model is finally selected to extract text features in engineering practice.

After integrating the above three studies, multimodal learning is used to design an algorithm that fuses image features and text features at the feature level. The purpose is to use multi-source data to assist learning, improve classification accuracy, and reduce generalization error. Finally, this thesis uses Keras (an artificial neural network development library) to implement the multimodal classification algorithm. The algorithm addresses the problem that a CNN cannot extract effective features from product images. Compared with classification that uses image information only, multimodal fusion inference improves accuracy by 6% on the test dataset. Chapter 6 of this thesis provides a Web service for product image classification.
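
As a rough illustration of what fusing image and text features "at the feature level" can mean, the sketch below concatenates two feature vectors and passes the result through one dense layer with softmax. It is a library-free sketch under stated assumptions, not the thesis's Keras implementation: the vectors merely stand in for CNN image features and BERT text features, and the function names, vector sizes, class count, and random weights are all hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(image_features, text_features):
    """Feature-level fusion by concatenating the two modality vectors."""
    return list(image_features) + list(text_features)


def classify(fused, weights, biases):
    """One dense layer (one weight row per class) followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Hypothetical stand-ins: in practice these would come from a fine-tuned
# CNN (image branch) and a language model such as BERT (text branch).
rng = random.Random(0)
image_features = [rng.random() for _ in range(8)]
text_features = [rng.random() for _ in range(4)]
num_classes = 3  # assumed number of product categories

fused = fuse_features(image_features, text_features)

# Untrained random weights, used only to show the shapes involved.
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(num_classes)]
biases = [0.0] * num_classes

probabilities = classify(fused, weights, biases)
predicted_class = max(range(num_classes), key=probabilities.__getitem__)
print(len(fused), predicted_class)
```

Concatenation is only one possible fusion operator. It is shown here because it keeps both modalities intact and lets the classifier layer learn how to weight them.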
Keywords/Search Tags: transfer learning, image classification, image text detection, multimodal learning, deep learning