Object detection is a fundamental problem in computer vision, with important applications in surveillance, autonomous driving, and medical analysis. Objects appear at different sizes in an image, so features at multiple scales are needed to detect them. The feature pyramid network currently addresses this problem effectively. However, it is limited by single-layer detection and by defects in its own design. The features it extracts lack global context information and channel information. The levels of the feature pyramid differ in scale, semantics, and resolution, which leads to bounding boxes of unbalanced quality. Moreover, each layer of the feature pyramid network detects targets separately while the regression loss is computed as a whole, so the regression losses of the individual layers are not optimized collaboratively. To address these issues, this paper makes the following main contributions:

(1) To address the difficulty of perceiving global context information and the loss of channel information in the feature pyramid network, a new feature pyramid network based on a multi-layer attention mechanism is proposed. A Transformer feature pyramid module and a channel attention module are designed using the self-attention mechanism and the channel attention mechanism, respectively. By modeling the correlation between features at different levels of the feature pyramid, the model learns global context information. It then uses learnable network parameters to infer the importance of different channel features from the original and constructed features, so that the more important channel features are enhanced. This improves the representation capability and utilization efficiency of the features, improves the accuracy and efficiency of object detection, and provides a foundation for the further development of object detection technology.

(2) To address the unbalanced quality of bounding boxes and the lack of cooperative optimization of the regression loss across the layers of the pyramid network, an adaptive regression loss function is proposed. A reweighting method adaptively assigns different weights to the loss, so that training focuses on high-quality bounding boxes. The loss function retains the advantages of the original GIoU loss, adaptively amplifies the differences between bounding boxes in the regression loss, and dynamically adjusts the contribution of the loss to the model. As a result, both the convergence speed and the accuracy of the model are improved.

(3) The above work is applied to fatigue detection. A fatigue detection system based on the FCOS neural network and the PyTorch framework is developed, and the model is trained on a self-made dataset. The system addresses the insufficient robustness of traditional detection algorithms to head pose and face occlusion, and it can effectively and accurately identify human fatigue states in pictures, videos, and camera data.
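
To make contribution (2) concrete, the sketch below computes the standard GIoU loss for axis-aligned boxes and applies a simple per-box reweighting that emphasizes high-quality boxes. The weighting rule (IoU raised to an exponent `gamma`), the function names, and the `(x1, y1, x2, y2)` box format are assumptions made for illustration only. The abstract does not give the actual adaptive weighting formula, so this should not be read as the proposed loss.

```python
def giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes in (x1, y1, x2, y2) format.

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou, iou - (enclose - union) / enclose


def reweighted_giou_loss(preds, targets, gamma=1.0):
    """Weighted mean of per-box GIoU losses (1 - GIoU).

    Hypothetical weighting: each box is weighted by IoU ** gamma, so boxes
    that already overlap their target well contribute more. gamma = 0
    recovers the plain mean GIoU loss.
    """
    losses, weights = [], []
    for pred, target in zip(preds, targets):
        iou, g = giou(pred, target)
        losses.append(1.0 - g)
        weights.append(iou ** gamma)

    total_weight = sum(weights)
    if total_weight == 0.0:
        # No box overlaps its target: fall back to the unweighted mean.
        return sum(losses) / len(losses)
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


preds = [(0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)]
targets = [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)]
plain = reweighted_giou_loss(preds, targets, gamma=0.0)
focused = reweighted_giou_loss(preds, targets, gamma=1.0)
```

With `gamma=1.0`, the poorly matched second box is down-weighted, so `focused` is smaller than `plain`. In a real detector the weights would typically be treated as constants during backpropagation, which is a further design choice not covered here.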