| Image captioning takes an image as input and generates a caption that describes its visual semantics. The technology combines two modalities, image and language, and has broad application prospects, so it is important both to improve algorithm performance and to put the technology into practical use. This paper has two parts: an image captioning algorithm that integrates panoptic segmentation features, and deployment optimization of the image captioning algorithm. (1) Image captioning algorithm integrating panoptic segmentation features. The Bottom-Up method, which is based on an object detection network, is currently the most popular feature extraction algorithm. Its regional features nevertheless have two problems: 1) because the Bottom-Up algorithm extracts features at the object level, it easily overlooks fine details; 2) because the algorithm extracts features from bounding boxes, background information inside a box can easily degrade the accuracy of the target information. To address these problems, this paper proposes using a panoptic segmentation algorithm to extract fine-grained features and fusing them with the regional features to compensate for the latter's limitations. The panoptic segmentation algorithm classifies every pixel, so it can attend to details and effectively separate the background from the target, which prevents their information from overlapping. Two fusion methods are proposed: the direct fusion method is used to verify the effectiveness of the fine-grained features, and the dual-branch fusion method combines the two kinds of features more effectively. Extensive experiments were conducted on the MS COCO dataset. On the MS COCO online test set, the experiments show that the fine-grained features are effective and that the dual-branch fusion method outperforms several other advanced methods. The CIDEr score reaches a maximum of 134.3%, 0.6% above the baseline. (2) Deployment optimization of the image captioning algorithm. Although image captioning
technology has advanced recently, few products on the market use it, and even fewer papers and references discuss its deployment. This paper puts the technology into practice and, in doing so, provides technical support for future work. Deploying image captioning technology faces two problems: 1) no effective deployment scheme exists; 2) model conversion is difficult. When a model is converted from the dynamic PyTorch framework to a static framework, problems such as operator mismatches and complex conditional structures that are hard to express arise, which makes deployment harder. To address these problems, this paper comprehensively weighs three factors, namely deployment feasibility, accuracy, and computational cost, and arrives at a feasible and effective scheme. The model in this scheme was then successfully deployed on the Huawei Atlas200 DK platform, and the deployment was optimized to reduce inference time. In summary, this paper proposes using a panoptic segmentation algorithm to extract fine-grained features and proposes two fusion methods to combine them with regional features, which improves the accuracy of the generated descriptions and offers new research directions for future researchers. In addition, this paper proposes a feasible and effective deployment scheme that has been implemented on the Huawei Atlas200 DK platform, providing technical support for the subsequent integration of image captioning technology into real products. |
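
The difference between box-based and pixel-level features described in part (1) can be illustrated with a minimal sketch. It is not the paper's implementation: the helper names `box_pool`, `mask_pool`, and `direct_fuse` are hypothetical, average pooling stands in for whatever pooling the networks use, and simple concatenation is only an assumed form of direct fusion, which the abstract does not specify.

```python
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # H x W x C grid of per-pixel feature vectors
Mask = List[List[int]]                # H x W grid, 1 = pixel belongs to the segment


def _mean(vectors: List[List[float]]) -> List[float]:
    """Channel-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty set of pixels")
    n = len(vectors)
    return [sum(v[c] for v in vectors) / n for c in range(len(vectors[0]))]


def box_pool(fmap: FeatureMap, box: Tuple[int, int, int, int]) -> List[float]:
    """Average every pixel inside box (x0, y0, x1, y1); x1 and y1 are exclusive.

    Background pixels that fall inside the box are averaged in as well.
    """
    x0, y0, x1, y1 = box
    return _mean([fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)])


def mask_pool(fmap: FeatureMap, mask: Mask) -> List[float]:
    """Average only the pixels that the segmentation mask assigns to the segment."""
    return _mean([fmap[y][x]
                  for y, row in enumerate(mask)
                  for x, inside in enumerate(row) if inside])


def direct_fuse(region: List[float], segment: List[float]) -> List[float]:
    """Assumed form of direct fusion: concatenate the two feature vectors."""
    return list(region) + list(segment)


# Toy 2 x 2 feature map with one channel: the object (value 1.0) occupies
# the left column, the background (value 0.0) the right column.
fmap = [[[1.0], [0.0]],
        [[1.0], [0.0]]]
mask = [[1, 0],
        [1, 0]]

region_feat = box_pool(fmap, (0, 0, 2, 2))   # box covers object and background
segment_feat = mask_pool(fmap, mask)         # mask covers the object only
fused = direct_fuse(region_feat, segment_feat)
print(region_feat, segment_feat, fused)
```

In this toy case the box feature is diluted by the background pixels inside the box, while the mask feature keeps only the object's pixels, which is the limitation of box-based extraction that the fine-grained features are meant to offset.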