Research And Application Of Multi-angle Text Detection Algorithm Based On Deep Learning

Posted on: 2022-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhou
GTID: 2518306326971569
Subject: Software engineering
Abstract/Summary:
With the development of the Internet and smart devices, natural scenes can be captured as photographs and shared online. Natural scene images contain rich text information, and text extracted from them using deep learning can be applied in many fields, such as intelligent transportation navigation, text retrieval and translation, and bank bill recognition. Because of the shooting angle of the capturing device, most text in natural scene images appears at varying angles, which makes it difficult to detect. Accurately extracting multi-angle text from scene images has therefore become a hot topic in text detection.

This thesis studies the task of multi-angle text detection in natural scenes. First, to expand the available Chinese data samples, images suitable for text detection in natural scenes, such as nameplates, license plates, street-view shops, and road signs, are collected and labeled, and a mixed Chinese and English data set, TDS, suitable for multi-angle text detection is constructed. TDS is used together with ICDAR2013 and ICDAR2015 as the data basis for the experiments. Second, to address disordered text direction, low feature extraction efficiency caused by complex scenes, and low detection accuracy caused by text angle, three deep-learning-based models are studied for multi-angle text images: a text direction prediction model, a feature extraction model, and a detection model. Finally, to demonstrate the effectiveness of the approach, the three proposed methods are applied in a multi-angle text detection system. The main work of this thesis is as follows:

(1) A multi-angle text direction classification model and a feature extraction model based on deep learning are studied. VGG16 is used to classify the text direction of images. To address overfitting caused by parameter redundancy, the hybrid pruning method is improved to remove parameters that rank low in importance and to find an optimal parameter subset, thereby compressing the model. The model is trained on the TDS data set. Compared with the original network and with other pruning methods, the pruned model is found to be effective and to train more efficiently. Darknet53 is used as the feature extraction network. To address the loss of features across feature maps of different scales during detection, this thesis proposes a network that fuses shallow-layer information to train and extract text features. Experiments on the TDS and ICDAR data sets show that training efficiency is higher.

(2) The YOLO-BOX multi-angle text detection model, based on YOLO, is studied. The YOLO algorithm describes its localization results with horizontal rectangles, which cannot effectively detect inclined text boxes. To solve this problem, an improved YOLO-BOX algorithm is proposed; through target candidate box prediction, clustering filtering, and angle correction, the optimized algorithm becomes suitable for detecting multi-angle text regions in natural scenes. The model is trained on the TDS and ICDAR data sets and compared with the original algorithm and a variety of other detection algorithms; it is found to achieve better accuracy and recall.

(3) A multi-angle text detection system based on the improved algorithms is studied. A complete text region detection pipeline is proposed using the three methods studied, and each functional module of the detection system is implemented in Python. Given an input image, the system directly detects text regions and outputs the text content. The system is well suited to text detection in many scenes, such as nameplate detection, license plate detection, and street-view detection.
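
The limitation of horizontal rectangles noted in item (2) can be illustrated with a small geometric sketch. This is not the thesis's YOLO-BOX algorithm, whose details the abstract does not give; it only shows, under the assumption that an inclined text box is described by a centre, a width, a height, and an angle, how much larger the enclosing horizontal rectangle becomes. The function names `rotated_box_corners` and `enclosing_horizontal_box` and the example numbers are hypothetical.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w x h box centred at (cx, cy).

    The box is rotated by angle_deg about its centre using the standard
    rotation matrix (counter-clockwise in a y-up frame; the same formula
    appears clockwise in image coordinates, where y points down).
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets from the centre before rotation.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


def enclosing_horizontal_box(corners):
    """Return (x_min, y_min, x_max, y_max) of the axis-aligned box around corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical example: an 80 x 20 text line inclined by 30 degrees.
corners = rotated_box_corners(100, 50, 80, 20, 30)
x0, y0, x1, y1 = enclosing_horizontal_box(corners)
tight_area = 80 * 20                   # area of the inclined box itself
loose_area = (x1 - x0) * (y1 - y0)     # area of the horizontal rectangle
print(f"inclined box area: {tight_area}, horizontal box area: {loose_area:.1f}")
```

In this example the horizontal rectangle covers well over twice the area of the inclined text line, so most of it is background. That is the kind of mismatch an angle-aware box description is meant to avoid.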
Keywords/Search Tags:Deep learning, Text detection, Convolutional neural network, VGG16, YOLO