| Laryngeal diseases such as vocal fold nodules and vocal fold polyps have become more common in recent years; they can impair respiration and vocalization and may even lead to cancer. Laryngeal endoscopy is the primary clinical method for diagnosing these diseases. As laryngoscope images have become digitized and storable, computer-aided diagnosis based on laryngeal images has begun to be explored and implemented. Glottis segmentation is the first step in the automated analysis of laryngeal images. The glottis is the opening between the two symmetrical vocal folds of the larynx; its morphology reflects the state of the vocal folds and plays an important role in the diagnosis of laryngeal diseases. Accurate glottis segmentation is the basis for subsequent computer-aided diagnosis, but it is challenging because of the varied shapes of the glottis, its low contrast with surrounding tissue, the presence of laryngeal diseases, and other factors. In this paper, a deep attention network based on U-Net (DA-Unet) with a color normalization operation is proposed to achieve end-to-end segmentation of the glottal area. The original images are first color-normalized to reduce the adverse effects of low contrast and of large color differences between images. The normalized images are then fed to DA-Unet for glottis segmentation. In this network, a residual structure is incorporated to build a deeper network and extract richer features. After feature extraction, a dilated feature pyramid attention module enhances the semantic information of the glottal area and suppresses interference from other areas. Finally, during up-sampling, the high-level features are fused step by step with the corresponding low-level features through element-wise summation, which better restores the edge and location information of the glottis and yields more accurate glottis boundaries. To test
the proposed method, a laryngoscope image dataset containing images from both healthy and pathological subjects is established. The effectiveness of the color normalization operation, of the submodules of DA-Unet, and of the overall structure of the proposed method is evaluated through a series of comparative experiments on this self-built dataset, using several reliable and widely used evaluation metrics. The experimental results confirm the effectiveness of the proposed method. Compared with several classical segmentation networks, the proposed algorithm performs better in glottis segmentation. Finally, the proposed segmentation algorithm is applied to the detection of laryngeal diseases, with good results. The proposed method therefore has important practical value for the development of computer-aided diagnosis systems for laryngeal diseases. |
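
The fusion step described in the abstract, in which an up-sampled high-level feature map is added element-wise to the corresponding low-level feature map, can be illustrated with a minimal single-channel sketch. The abstract does not say which up-sampling operator DA-Unet uses, so nearest-neighbor up-sampling is an assumption here, and the names `upsample_nearest` and `fuse_by_sum` are hypothetical. The actual network would operate on multi-channel tensors with learned layers.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, rows x columns


def upsample_nearest(feature: FeatureMap, factor: int = 2) -> FeatureMap:
    """Enlarge a map by repeating each value factor x factor times.

    Nearest-neighbor up-sampling is an assumption; the abstract does not
    name the operator used in DA-Unet.
    """
    upsampled: FeatureMap = []
    for row in feature:
        wide_row = [value for value in row for _ in range(factor)]
        upsampled.extend(list(wide_row) for _ in range(factor))
    return upsampled


def fuse_by_sum(high_level: FeatureMap, low_level: FeatureMap,
                factor: int = 2) -> FeatureMap:
    """Up-sample the coarse map, then add it element-wise to the fine map."""
    up = upsample_nearest(high_level, factor)
    if len(up) != len(low_level) or len(up[0]) != len(low_level[0]):
        raise ValueError("shapes must match after up-sampling")
    return [
        [u + l for u, l in zip(up_row, low_row)]
        for up_row, low_row in zip(up, low_level)
    ]


high = [[1.0, 2.0]]                 # coarse 1x2 map (semantic information)
low = [[0.5, 0.5, 0.5, 0.5],
       [0.0, 0.0, 1.0, 1.0]]        # fine 2x4 map (edge and location detail)
fused = fuse_by_sum(high, low)
```

Unlike the channel concatenation used in the original U-Net skip connections, summation keeps the channel count unchanged, which is why the two maps must have identical shapes before they are combined.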