
Research On Text Detection And Recognition Of Natural Scenes Based On Deep Learning

Posted on: 2024-05-09; Degree: Master; Type: Thesis
Country: China; Candidate: B Li; Full Text: PDF
GTID: 2558306920454974; Subject: Computer technology
Abstract/Summary:
Computer vision has developed rapidly in recent years, and the detection and recognition of text in natural scenes, an important part of the field, has received extensive attention. It has promising applications in intelligent transportation, autonomous driving, assistive devices for the visually impaired, image content retrieval and other fields. Compared with traditional detection and recognition methods, methods based on deep learning offer much better accuracy and speed and have strong research value. Text in natural scenes is affected by factors such as uneven arrangement, varying sizes, varying colors and complex backgrounds, and existing algorithms are not accurate enough under these conditions. This thesis therefore studies deep-learning-based methods for natural scene text detection and recognition. The main work is as follows:

1. To address the complex processing and long running time of existing detection algorithms, this thesis proposes a text detection algorithm that integrates a channel attention mechanism. In the feature extraction stage, a feature pyramid structure extracts image features, and an improved version of the channel attention module ECA-Net is introduced to extract deeper multi-layer text features, enhance the original features and obtain attention weights, which improves the detection ability of the algorithm. The experimental results demonstrate the effectiveness of the proposed detection algorithm.

2. To address the weak performance of existing algorithms on arbitrarily oriented text, this thesis proposes FTA-RPN (Fit Text Area-RPN), a region proposal network that adapts to the text area. By attending to the relationship between the regressed anchor and the regression offset, it generates detection boxes that fit the text area more closely. To deal with the sample imbalance in text detection, a new loss function is used to improve model accuracy. The experimental results show that FTA-RPN is clearly more accurate than the original RPN.

3. To address the poor accuracy of existing recognition algorithms, this thesis proposes DDL-Net (Double Decoding Layers-Network), a recognition model with two decoding layers. A text correction module first rectifies the text in the image, and the rectified image is then passed to the recognition module for content recognition. The recognition network is an attention-based learning model that automatically captures the information flow in the input sequence, models the output at the character level and copes with blurred text. In addition, a bidirectional long short-term memory (BiLSTM) network is introduced to predict and output the character sequence. With a comprehensive classification loss function, the model achieves higher recognition accuracy. The experimental results show that the redesigned text recognition model improves both recognition accuracy and speed.
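The channel attention step mentioned in item 1 can be illustrated with a small sketch. The code below follows the standard ECA-Net recipe (global average pooling, a 1D convolution across neighbouring channels, a sigmoid, then channel-wise rescaling); it is not the improved module proposed in the thesis, whose details the abstract does not give. The function name `eca_attention`, the nested-list feature layout and the hand-supplied `kernel` weights are assumptions made for illustration; in a real network the kernel would be learned.

```python
import math


def eca_attention(features, kernel):
    """Apply standard ECA-style channel attention (illustrative sketch).

    features: list of C channels, each a list of H rows of W floats.
    kernel:   odd-length list of 1D convolution weights (hypothetical,
              hand-supplied here; learned during training in practice).
    Returns (rescaled feature map, per-channel attention weights).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    num_channels = len(features)
    pad = len(kernel) // 2

    # 1. Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]

    # 2. 1D convolution across neighbouring channels with zero padding,
    #    so each weight depends only on a local channel neighbourhood.
    weights = []
    for c in range(num_channels):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - pad
            if 0 <= idx < num_channels:
                acc += w * pooled[idx]
        # 3. Sigmoid maps the response into the open interval (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-acc)))

    # 4. Rescale every value in a channel by that channel's weight.
    scaled = [
        [[value * weights[c] for value in row] for row in channel]
        for c, channel in enumerate(features)
    ]
    return scaled, weights
```

Calling `eca_attention(feature_map, [0.25, 0.5, 0.25])` on a feature map with any number of channels returns a map of the same shape together with one attention weight per channel. Unlike squeeze-and-excitation blocks, this design avoids fully connected layers, so the number of extra parameters equals the kernel length.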
Keywords/Search Tags: scene text, deep learning, text detection, text recognition, attention mechanism