
Lightweight Video Fire Detection Network With Spatial-Temporal Attention Optimization

Posted on: 2024-06-28  Degree: Master  Type: Thesis
Country: China  Candidate: W X Zhao  Full Text: PDF
GTID: 2531307109970569  Subject: Control Science and Engineering
Abstract/Summary:
Among the various disasters, fire is one of the major threats to public safety and social development. Because fires can cause incalculable damage to human life, property, and the national economy, timely detection and response are often more important than post-event remediation, and fire detection has long been an active research topic. With the development of image processing technology and the spread of video surveillance, flame detection methods based on computer vision have been widely used. To overcome the blindness and lack of robustness of manual feature selection, recent studies have applied mainstream deep convolutional object detection networks, such as YOLO and Faster RCNN, to localize and recognize fire targets in images. However, these deep convolutional networks have complex structures and run slowly. To meet the requirement of real-time fire detection, this paper investigates a lightweight video fire detection network based on spatial-temporal attention optimization. Introducing a Transformer module into the CNN backbone network improves its feature representation capability, so that multi-scale local and global features of fire are extracted and fused efficiently, while the lightweight backbone preserves detection speed. The paper also attempts to improve the accuracy of the network by incorporating the dynamic temporal information of fire motion. The main research contents of this paper are as follows:

(1) A new lightweight feature extraction backbone network is designed. Current classical network models suffer from redundancy in structure and computation, while classical lightweight models suffer from insufficient feature extraction and low detection accuracy. This paper constructs a new lightweight backbone network, introduces depthwise separable convolution, and redesigns the connection scheme of the network to improve detection speed and accuracy, reduce the number of parameters and the computational cost, and lay the foundation for subsequent feature processing.

(2) A Transformer module is introduced to add spatial attention and improve the detection rate for targets at different resolutions. Ordinary convolutional neural networks (CNNs) are limited by the size of their receptive field and cannot obtain a global field of view. This makes it hard to relate the current pixel to edge pixels and reduces the accuracy of the network in separating foreground from background. This paper introduces a Transformer mechanism into the backbone network. Running in parallel with the CNN, it captures distance-independent global information, so that local and global information extraction are handled separately. At the same time, the features of layers with different resolutions are fused in a feature pyramid. This fusion lets feature information at different scales complement each other and improves the recognition performance of the backbone network for detection targets of different sizes.

(3) A temporal attention sub-network for processing video frames is constructed. It adds the dynamic temporal features of fire to the current frame to improve the detection rate of fire in low-quality video frames. Because current convolutional neural networks for fire detection lack temporal information, the temporal attention sub-module processes adjacent video frames simultaneously and supplements the features of lower-confidence frames, assisting the recognition of fire in these poor-quality frames.
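As a rough illustration of why the depthwise separable convolution mentioned in item (1) reduces the parameter count, the sketch below compares the number of weights in a standard convolution with a depthwise convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are hypothetical and chosen only for the arithmetic; they are not taken from the thesis, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(std, sep, round(sep / std, 3))  # 73728 8768 0.119
```

For these sizes the separable form needs about 12% of the weights of the standard layer; in general the ratio is 1/c_out + 1/k^2.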
Keywords/Search Tags:fire detection, attention mechanism, lightweight network, convolutional neural network