Research On Scene Text Detection Algorithm Based On Improved Feature Pyramid Network And Feature Enhancement Fusion

Posted on:2024-06-16

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Feng

GTID:2568307100988689

Subject:Computer Science and Technology

Abstract/Summary:

Text detection in natural scenes is the task of automatically identifying and locating text regions in natural scene images or videos. With the advancement of deep learning, this technology has been widely applied in fields such as traffic supervision, intelligent driving, image retrieval, and classification. However, text in scene images is diverse and complex: it varies in font, may be occluded or unevenly lit, and often appears against cluttered backgrounds, all of which make detection challenging. Current text detection models typically use image classification networks as backbones. Image classification and object detection place different requirements on a backbone, but redesigning and pretraining a new feature extraction network requires significant computational resources. To improve efficiency, this study adopts an improved composite backbone network that directly combines pretrained networks and introduces a dynamic gate mechanism to reduce the transmission of redundant information and enhance the feature extraction capability of the backbone. Current text detection algorithms increasingly take high-resolution images as input because they provide richer semantic features, but this also requires the model to have a larger receptive field. This paper therefore proposes a dual-branch attention-guided feature pyramid method, which expands the receptive field through a dual-branch feature fusion module that fuses coarse-grained and fine-grained features. An attention-guided module is also introduced to enhance the semantic information of the features and to reduce the damage that dilated convolutions cause to text boundary information. To address the information loss that occurs when multi-scale feature maps are fused, this paper proposes the Feature Enhancement Fusion Module (FEFM). By applying attention at the feature level, over spatial positions, and over output channels, the module enhances the network's perception capability and makes effective use of multi-scale features while avoiding information sparsity and loss. Finally, ablation experiments on the public ICDAR2015 dataset verify the effectiveness of each proposed module, and the algorithm is compared with current mainstream scene text detection algorithms on ICDAR2015, ICDAR2017, and MSRA-TD500. Compared with these algorithms, the precision (P), recall (R), and F-measure of the proposed algorithm all improve to some extent; recall in particular improves significantly, which greatly reduces missed text detections. The F-measure of the proposed algorithm reaches 85.8%, 74.5%, and 84.8% on ICDAR2015, ICDAR2017, and MSRA-TD500, respectively.
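
The abstract reports precision P, recall R, and F-measure without defining them. The sketch below assumes the standard definitions, with F-measure as the harmonic mean of P and R. The function names and the counts in the usage example are hypothetical and are not taken from the thesis. The rule for deciding whether a detected box matches a ground-truth box depends on each dataset's evaluation protocol and is not modeled here.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_scores(true_positives: int, num_detections: int, num_ground_truth: int):
    """Return (P, R, F) from match counts.

    true_positives   -- detected boxes matched to a ground-truth text region
    num_detections   -- all boxes output by the detector
    num_ground_truth -- all annotated text regions
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical counts, for illustration only (not results from the thesis).
p, r, f = detection_scores(true_positives=80, num_detections=100, num_ground_truth=120)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

Under these definitions a missed text region lowers R but not P, which is why the abstract connects the gain in recall to fewer missed detections.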

Keywords/Search Tags:

Text detection, Feature pyramid, Attention mechanism, Feature fusion

Related items

1	Scene Text Detection Based On Feature Fusion And Pyramid Attention
2	Research On Object Detection Algorithm Based On Feature Pyramid Fusion And Attention Mechanism
3	Efficient And Lightweight Feature Pyramid Network For Object Detection
4	Target Detection Algorithm Based On Feature Pyramid Structure
5	Research On Chinese Sign Language Detection And Recognition Based On Feature Fusion
6	Research On Lightweight Dehazing Algorithm And Model Optimization Method Based On Feature Pyramid Network
7	Research On OCR Detection And Recognition Technology Based On Deep Learning
8	Research On Target Detection Algorithm Based On Bidirectional Feature Fusion
9	Video Object Detection Based On Attention Mechanism And Multi-Scale Feature Fusion Convolutional Network
10	Research On Attention Mechanism And Multi-scale Feature Fusion Method For Object Detectio