Research And Implementation Of Product Image Classification Based On Deep Learning

Posted on:2021-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:D Shi

GTID:2428330626955814

Subject:Engineering

Abstract/Summary:

With the development of Internet and mobile communication technology, the number of product pictures on the Internet is growing at a massive rate. In engineering practice, it is found that many product pictures do not share the characteristics of common objects, so the accuracy of image classification with a CNN model is low. However, the text in an image contains rich information, which can be extracted with natural-scene text recognition technology. This thesis takes the product image classification algorithm based on deep learning theory as its key research object, taking the characteristics of the engineering field into consideration. The main research work is as follows:

(1) This thesis first studies image classification technology. After analyzing bad cases, it proposes a technical route of multi-modal learning that uses the text information in images. Convolutional neural networks based on VGG16, Inception, and ResNet50 are fine-tuned with transfer learning, which provides the image classification technique for the multi-modal network.

(2) This thesis studies text extraction and recognition from product pictures, which provides text-modal data for the multi-modal network. The text boxes in a product picture are extracted by the text detection model EAST. This thesis also implements the text recognition network CRNN.

(3) This thesis studies text classification algorithms, which provides the text classification technique for the multi-modal network. Two text classification models, TextCNN and BERT, are implemented. The BERT language model is finally selected to extract text features in engineering practice.

After integrating the above three studies, multi-modal learning technology is used to design an algorithm that fuses image features and text features at the feature level. The purpose is to use multi-source data to assist learning, improving classification accuracy and reducing generalization error. Finally, this thesis uses Keras (an artificial neural network development library) to implement the multi-modal classification algorithm. The multi-modal classification algorithm solves the problem that a CNN cannot extract effective features from product images. Compared with classification that uses image information only, the accuracy of multi-modal fusion inference is improved by 6% on the test dataset. Chapter 6 of this thesis provides a Web service for product image classification.
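
The idea of fusing image and text features at the feature level can be illustrated with a minimal sketch. It is not the thesis's implementation, which uses Keras with CNN and BERT feature extractors. It assumes the simplest form of fusion, concatenating the two feature vectors and passing the result through one dense layer with softmax. The feature dimensions, the random weights, and the helper names (`fuse_features`, `dense`, `classify`) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(image_feat, text_feat):
    """Feature-level fusion by concatenation (an assumed fusion scheme)."""
    return list(image_feat) + list(text_feat)


def dense(x, weights, bias):
    """One fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def classify(image_feat, text_feat, weights, bias):
    """Fuse both modalities, then score the classes on the fused vector."""
    fused = fuse_features(image_feat, text_feat)
    return softmax(dense(fused, weights, bias))


# Placeholder features: in a real pipeline these would come from an image
# model (e.g. a fine-tuned CNN) and a text model (e.g. BERT) applied to the
# text recognized in the picture. The sizes 4 and 3 are hypothetical.
rng = random.Random(0)
image_feat = [rng.uniform(-1, 1) for _ in range(4)]
text_feat = [rng.uniform(-1, 1) for _ in range(3)]

num_classes = 5
fused_dim = len(image_feat) + len(text_feat)
# Untrained random weights, used only to make the sketch executable.
weights = [[rng.uniform(-0.5, 0.5) for _ in range(fused_dim)]
           for _ in range(num_classes)]
bias = [0.0] * num_classes

probs = classify(image_feat, text_feat, weights, bias)
predicted = max(range(num_classes), key=probs.__getitem__)
print("class probabilities:", [round(p, 3) for p in probs])
print("predicted class index:", predicted)
```

Because the classifier sees one fused vector, text read from the picture can influence the prediction even when the visual features alone are ambiguous. In practice the dense layer would be trained jointly on labeled product images rather than initialized at random as it is here.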

Keywords/Search Tags:

transfer learning, image classification, image text detection, multimodal learning, deep learning

Related items

1	Research On Image Classification Method Based On Deep Transfer Learning
2	Research And Applications Of Image-text Multimodal Correlation Learning
3	Research On Key Issues Of Image Classification And Annotation By Fusing Text Information
4	Multimodal Classification System Based On Image And User Access Time Series
5	Research And Application On Deep Transfer Learning Algorithm In Text Classification
6	Research On Image Classification Algorithm Via Transfer Learning
7	Scene Image Text Detection Based On Deep Learning Method
8	Research On Clothing Image Classification And Retrieval Based On Deep Learning
9	Multi-source Deep Transfer Learning
10	Research On Several Issues Of Image Generation And Recognition Based On Deep Learning