
Algorithm Research On Character Recognition And Model Acceleration Of Natural Scene Based On Deep Learning

Posted on: 2021-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: K Zhan
GTID: 2518306470466904
Subject: Software engineering
Abstract/Summary:
Text is one of the most important sources of information in daily life. Besides documents, it appears widely in natural scenes, for example on shop signs along streets, road signs, and billboards. In computer vision, extracting text from images is an important supporting technology for applications such as travel assistance for blind people, autonomous driving, and smart cities, so it has long had high research value. After AlexNet won the ImageNet image recognition challenge in 2012 and surpassed traditional methods, deep learning became widely known and has since been applied to scene text recognition by many researchers.

In natural scene images, the background usually occupies most of the image and is complex, so recognizing text directly from the full image is very difficult. The usual approach is to detect and locate the text region first, crop that small region, and then apply a recognition algorithm to obtain the final result. For text detection, this thesis uses the PSENet algorithm and improves it with dilated convolution, which enlarges the receptive field, strengthens the detection of long text, and improves precision and recall. The training data come from public datasets commonly used in the field. For text recognition, this thesis trains a CNN + LSTM structure with CTC loss. The training set consists of a large number of richly varied text images generated automatically by a text image synthesis program developed for this work, and the resulting model achieves high accuracy.

Although deep learning methods have greatly advanced scene text recognition, the models are computationally expensive, and obtaining results takes a long time even on a well-configured GPU server. This has motivated research on inference acceleration for deep learning models. To reduce the running time of the trained high-accuracy models, and after surveying recent related research in China and abroad, this thesis proposes its core contribution: a directional weighted model pruning algorithm based on group Lasso, called DWGL, which removes a large share of the redundant parameters and computation in a deep learning model and thereby shortens running time. In comparative experiments against other state-of-the-art pruning algorithms on public datasets, DWGL achieves a pruning ratio of about 75%, and the resulting paper was published at the international academic conference BMVC (CCF). Once the correctness and practicality of DWGL had been verified, it was applied to the trained PSENet and CRNN models, with a clear speedup. TensorRT, NVIDIA's deep learning inference acceleration engine, was then studied and used for further acceleration in actual deployment. Combining the two methods makes PSENet and CRNN run about 5 times faster, with an accuracy loss of only 2%.
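
The abstract names group Lasso as the basis of DWGL but does not give its formulation. As a rough illustration only, the sketch below shows plain group Lasso applied to convolution channels: each output channel's weights form one group, the penalty is the sum of the groups' L2 norms, and channels whose norm stays small are pruned. The directional weighting that distinguishes DWGL is not described here and is not modeled. The function names, the `lam` coefficient, and the `keep_ratio` parameter are hypothetical and chosen for this example.

```python
import math

def group_norms(weights):
    """L2 norm of each group; weights[c] is the flat weight list of channel c."""
    return [math.sqrt(sum(w * w for w in channel)) for channel in weights]

def group_lasso_penalty(weights, lam=1.0):
    """Plain group Lasso term: lam times the sum of per-channel L2 norms.

    Added to the training loss, it pushes whole channels toward zero.
    This is NOT the DWGL weighting, which the abstract does not specify.
    """
    return lam * sum(group_norms(weights))

def prune_channels(weights, keep_ratio=0.5):
    """Keep the channels with the largest norms (hypothetical selection rule).

    Returns the kept channel indices (in original order) and their weights.
    """
    norms = group_norms(weights)
    n_keep = max(1, int(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda c: norms[c], reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [weights[c] for c in kept]

if __name__ == "__main__":
    # Four toy channels with two weights each; values are made up.
    layer = [[3.0, 4.0], [0.0, 0.1], [0.6, 0.8], [0.01, 0.0]]
    print(group_lasso_penalty(layer, lam=0.1))
    print(prune_channels(layer, keep_ratio=0.5)[0])
```

In a real network, removing an output channel also removes the matching input channel of the next layer, which is where the savings in computation come from. The toy example leaves that bookkeeping out.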
Keywords/Search Tags:Deep Learning, Text Detection, Text Recognition, Model Compression, TensorRT