According to surveys, breast cancer has become a major threat to women’s health. In the diagnosis of breast ultrasound images, a doctor’s clinical experience has a significant impact on the diagnostic result. In addition, the large volume of ultrasound images imposes a heavy workload on doctors. Focusing on breast ultrasound images, this research applies deep learning to the classification and localization of lesion areas. It can help not only to improve diagnostic accuracy but also to lighten the burden on doctors.

At present, target detection algorithms based on convolutional neural networks are mainly divided into one-stage algorithms, such as YOLO, and two-stage algorithms, such as Faster RCNN. The Transformer model, originally used for natural language processing, has also been shown to perform well in object detection, as in the DETR model. This research builds on the two-stage target detection algorithm Faster RCNN to identify lesions in breast ultrasound images.

First, the ViT-Patch model is proposed to judge whether each patch of a divided breast ultrasound image contains the lesion area. Compared with the original ViT algorithm, the patch size is increased and the number of patches is reduced. The outputs for the image patches after the multi-head attention mechanism are each connected to a separate classification head, combined with the information of the Class Token. Different patch sizes are tried in the experiments, and a comparison experiment with the ResNet network is set up. The experimental results show that the output for the newly added Class Token in the ViT-Patch model can be used to classify the entire image, and that the outputs for the other patches can be used to classify each image patch, with satisfactory results.

Second, the target detection model Trans-Faster is proposed. Features extracted by a convolutional neural network have the advantage of inductive bias, while features extracted by a Transformer can establish connections between different parts of the ultrasound image. In the experiments, the breast ultrasound images are input in parallel into the ViT-Patch model and the feature extraction module of the Faster RCNN model, and the feature maps output by the two are fused. The experimental results show that the fused feature map achieves better detection performance.

Third, the target detection model Trans-Faster-Patch is proposed. The classification result output by the ViT-Patch model is combined with the process of determining whether an anchor area in the original image is a foreground area, which assists the Region Proposal Network of Faster RCNN in generating more accurate proposal regions.
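
To make the patch-level idea concrete, the following is a minimal Python sketch, not the implementation used in this research. It illustrates three things: how enlarging the patch reduces the number of patches, how patch-level labels could be derived from a lesion bounding box, and one possible way that patch-level lesion probabilities could support the foreground score of an anchor. The text does not state the labelling rule or the combination rule, so the overlap-based labelling, the area-weighted averaging, the linear blending, and all function names, the parameter `weight`, and the example sizes (a 224-pixel image with 16- or 56-pixel patches) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def patches_per_side(image_size: int, patch_size: int) -> int:
    """Number of patches along one side of a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return image_size // patch_size


def patch_labels(lesion: Box, image_size: int, patch_size: int) -> List[List[int]]:
    """Label a patch 1 if it overlaps the lesion box, else 0.

    Assumed labelling rule; the source does not specify one.
    """
    n = patches_per_side(image_size, patch_size)
    x0, y0, x1, y1 = lesion
    labels = []
    for row in range(n):
        py0, py1 = row * patch_size, (row + 1) * patch_size
        row_labels = []
        for col in range(n):
            px0, px1 = col * patch_size, (col + 1) * patch_size
            overlaps = px0 < x1 and x0 < px1 and py0 < y1 and y0 < py1
            row_labels.append(1 if overlaps else 0)
        labels.append(row_labels)
    return labels


def anchor_patch_support(
    anchor: Box, patch_probs: Sequence[Sequence[float]], patch_size: int
) -> float:
    """Area-weighted mean lesion probability of the patches under an anchor.

    Hypothetical helper: returns 0.0 if the anchor lies outside the patch grid.
    """
    ax0, ay0, ax1, ay1 = anchor
    weighted, area = 0.0, 0.0
    for row, probs in enumerate(patch_probs):
        for col, p in enumerate(probs):
            px0, py0 = col * patch_size, row * patch_size
            w = min(ax1, px0 + patch_size) - max(ax0, px0)
            h = min(ay1, py0 + patch_size) - max(ay0, py0)
            if w > 0 and h > 0:
                weighted += p * w * h
                area += w * h
    return weighted / area if area > 0 else 0.0


def fused_foreground_score(rpn_score: float, support: float, weight: float = 0.5) -> float:
    """Blend an anchor's foreground score with its patch support (assumed rule)."""
    return (1.0 - weight) * rpn_score + weight * support
```

With these assumed sizes, a 16-pixel patch gives a 14 x 14 grid and a 56-pixel patch gives a 4 x 4 grid, so each patch classifier covers a larger region of the image. In this sketch, an anchor that lies entirely over patches predicted as lesion receives full support, while an anchor over mostly background patches has its foreground score pulled down.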