| Multi-label image classification identifies the set of labels present in an image. Unlike single-label image classification, it must mine the hidden relationships in the image data. Two approaches have become common in recent years: learning the semantic correlation between labels, and learning the spatial relationships among the target entities in a picture. The typical way to learn label semantic correlation with a graph convolutional network is to build a label co-occurrence model, as in ML-GCN (Multi-Label Graph Convolutional Network). However, such methods convert the label conditional probability matrix into a label co-occurrence matrix using a single threshold, and a single threshold cannot accurately determine whether a co-occurrence relationship exists between two labels. This thesis therefore proposes MGAN (Multiple Sub-Graph Attention Network), a multi-label image classification method based on multiple graph convolutions. The main contributions of the thesis are as follows: (1) The thesis proposes using multiple thresholds to partition the label conditional probability matrix. According to the strength of the semantic correlation, the conditional probability matrix is converted into multiple label co-occurrence matrices, which reduces the error introduced by partitioning it with a single threshold. The semantic correlation in each label co-occurrence matrix is then learned by a separate graph convolutional network. (2) The matrix product of the label representation matrix and the feature map is taken as the attention score, and the label semantic correlations learned by the different graph convolutions are integrated into the convolutional neural network through an attention mechanism, so as to mine the hidden spatial relationships in an image. Comparison experiments on the MS-COCO and PASCAL VOC datasets show that the MGAN model outperforms the traditional ML-GCN model in mAP and F-measure, by introducing strong semantic correlations and using the attention mechanism to learn the spatial correlation of image regions. |
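
The two ideas in the abstract, partitioning the conditional probability matrix with several thresholds and using the product of the label representation matrix and the feature map as attention, can be illustrated with a minimal pure-Python sketch. This is not the thesis implementation. The function names, the toy values, the banding rule (each sub-graph keeps the edges whose probability falls in `[t_k, t_{k+1})`) and the softmax over spatial positions are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from typing import List

Matrix = List[List[float]]


def split_by_thresholds(prob: Matrix, thresholds: List[float]) -> List[List[List[int]]]:
    """Turn one conditional probability matrix into several binary sub-graphs.

    Assumption: thresholds are sorted ascending, and sub-graph k keeps the
    edges whose probability lies in [t_k, t_{k+1}); the last band is open-ended.
    """
    bounds = list(thresholds) + [float("inf")]
    graphs = []
    for low, high in zip(bounds[:-1], bounds[1:]):
        graphs.append([[1 if low <= p < high else 0 for p in row] for row in prob])
    return graphs


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax of one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(label_repr: Matrix, feature_map: Matrix) -> Matrix:
    """Attention of each label over the spatial positions.

    label_repr is C x D (one row per label); feature_map is D x N, where N is
    the number of flattened spatial positions. The raw score is their matrix
    product; the softmax normalization is an assumption of this sketch.
    """
    return [softmax(row) for row in matmul(label_repr, feature_map)]


# Toy conditional probabilities P(column label | row label) for 3 labels.
prob = [
    [0.0, 0.8, 0.3],
    [0.5, 0.0, 0.1],
    [0.9, 0.2, 0.0],
]
weak, strong = split_by_thresholds(prob, [0.25, 0.6])

# Toy label representations (2 labels, 2 channels) and a feature map
# (2 channels, 3 spatial positions).
label_repr = [[1.0, 0.0], [0.0, 1.0]]
feature_map = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
scores = attention_scores(label_repr, feature_map)
```

With two thresholds, the weakly and strongly correlated label pairs end up in different sub-graphs, so each can be passed to its own graph convolution instead of being merged or discarded by a single cut-off.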