Research On Image Recognition Of E-Commerce Product Promotion Image Based On Deep Learning

Posted on: 2021-03-16 | Degree: Master | Type: Thesis
Country: China | Candidate: Q L Meng | Full Text: PDF
GTID: 2428330620463592 | Subject: Computer application technology
Abstract/Summary:
Managers of e-commerce platforms need to extract information from the massive number of product images posted by shops in order to manage and control risk. Text recognition in product promotion images is therefore a key technology for e-commerce platform management. This thesis uses deep learning to detect and recognize text in e-commerce product promotion images, providing effective technical support for e-commerce management. The main research contents are as follows:

(1) The CTPN (Connectionist Text Proposal Network) text detection algorithm is prone to detection errors when it detects texts of different heights. To address this, we propose Hy-CTPN, a text detection algorithm built on CTPN that incorporates text height information. The method improves the CTPN text box merging mechanism and adds text height information to the refinement of the text detection boxes. We perform experiments on the ICDAR2013 and ICDAR2015 datasets. On ICDAR2013, Hy-CTPN achieves a recall of 85%, an F1-measure of 89%, and a detection time of 90 ms. Compared with the original CTPN, recall increases by 2%, the F1-measure increases by 1%, and the average prediction time is shortened by 50 ms. Hy-CTPN also performs well on our self-built product promotion image detection dataset.

(2) To address the detections that Hy-CTPN still misses, we propose a random forest posterior model for text detection based on convolutional autoencoder features. The method combines convolutional autoencoder features from deep learning with a random forest: the random forest performs fusion voting on the convolutional autoencoder features and selects the optimal feature region. With the random forest posterior model added, Hy-CTPN reaches a text detection accuracy of 94.9%, which is 1.4% higher than CTPN and 1.3% higher than Hy-CTPN alone.

(3) To address the low recognition accuracy caused by tilted text after detection, we introduce a perspective transformation correction algorithm. To reduce the time cost of image binarization, we propose OP-Niblack, which is based on the Niblack algorithm and has lower time complexity. Together these form a perspective-transformation text correction algorithm based on OP-Niblack. The algorithm binarizes the image with OP-Niblack, which reduces the per-pixel local window computation, and corrects the text region with a combination of the gradient method and perspective transformation. We perform correction experiments on distorted images obtained after text detection. The average correction time of the proposed algorithm is 1.18 s, which is 90 ms less than that of the better correction algorithms of recent years. We corrected text whose recognition accuracy before correction fell in the ranges (30%, 50%) and (50%, 70%); the average recognition accuracy after correction was 81.7% and 86.0%, respectively. The weighted average improvement in recognition accuracy is 1.13%.

(4) To address the excessive memory consumption of DenseNet networks during training and their low text recognition accuracy, we propose Simi-DenseNet + CTC, a character sequence recognition algorithm based on a DenseNet network that uses feature map matrix similarity, combined with CTC (Connectionist Temporal Classification). On our self-built text recognition dataset of product promotion images, it achieves an accuracy of 86.1% with an average test time of 0.18 s. Compared with DenseNet + CTC, the accuracy is 0.8% higher and the detection time is 90 ms shorter. In addition, Simi-DenseNet + CTC significantly reduces the memory footprint of training.

We implement the above deep learning algorithms with the TensorFlow framework and the Python programming language, and validate them on three public datasets as well as our own datasets. The experimental results show that the proposed algorithms not only perform very well on the self-built product promotion image text dataset, but also achieve good results on the public datasets ICDAR2013, ICDAR2015, and ICDAR2017, which gives them strong research and application value.
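Contribution (4) pairs a DenseNet feature extractor with CTC. As a rough illustration of what the CTC step does at inference time, the sketch below shows greedy (best-path) decoding: take the highest-scoring label in each frame, merge consecutive repeats, then drop the blank label. The abstract does not say which decoding strategy the thesis uses, so greedy decoding is an assumption here. The function name `ctc_greedy_decode`, the three-symbol alphabet, the blank index 0, and the per-frame scores are all hypothetical and not taken from the thesis.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks; a blank between two equal labels keeps them distinct.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Illustrative input: 6 frames, 3 classes (index 0 is the blank).
alphabet = ["-", "a", "b"]
frame_scores = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two 'a' characters)
    [0.10, 0.60, 0.30],  # a
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, merged)
]
print(ctc_greedy_decode(frame_scores, alphabet))  # prints "aab"
```

The blank frame is what lets the decoder emit a doubled character such as "aa". Without it, the two runs of "a" would be merged into one.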
Keywords/Search Tags: Text detection, Text recognition, Hy-CTPN, Random forest, Simi-DenseNet + CTC