Images play an indispensable role in multimedia technology and directly reflect the real world from a visual perspective. They convey many kinds of information, such as themes, emotions, and knowledge. Making use of image data has become an important topic in the era of big data. Multi-label image classification is a crucial research area that aims to identify multiple objects or concepts within a single image. Unlike single-label image classification, it involves more complex relationships among objects, which makes it a challenging task. In real-life scenarios, the large volume of image data and the difficulty of acquiring accurate annotations mean that annotation information is often incomplete, which gives rise to weakly labeled data. Using weakly labeled data to train a classification model that still achieves excellent performance has become a hot research direction in machine learning and deep learning. Most current multi-label image classification methods assume training on fully labeled sample sets and therefore cannot directly handle weakly labeled data. Consequently, current research in the image field focuses on how to recover missing label information to facilitate model training and prediction.

This paper studies multi-label image classification methods for weakly labeled data, where incomplete label information primarily refers to the presence of partially labeled data and completely unlabeled data within the dataset. Building upon previous studies, this paper demonstrates the feasibility and rationality of the proposed methods through experiments on publicly available multi-label image datasets. The main contributions of this paper are as follows:

(1) The Weak Label Synchronous Learning (WLSL) algorithm is proposed. This algorithm integrates label recovery and classification so that both tasks are accomplished simultaneously, thereby improving efficiency and reducing time consumption without sacrificing accuracy. First, preliminary training is conducted on a small amount of fully labeled training data to obtain an initial model. Then, the weakly labeled training data is used as the validation set to optimize the model while simultaneously recovering the missing label information in the training set. This method makes effective use of weakly labeled data to train the model and improves its classification performance on such data. Furthermore, by establishing label correlations, missing label information can be recovered, further enhancing the model's performance. In the experiments, the proposed algorithm is compared with other representative methods in terms of label recovery accuracy and classification performance. The results show that the proposed algorithm surpasses the compared methods on all evaluation metrics.

(2) The Semi-supervised Weak Label Synchronous Learning (SWLSL) algorithm is proposed. This method improves upon the WLSL algorithm by adding a label propagation module that addresses label recovery for completely unlabeled samples. First, the algorithm uses the feature information of labeled and unlabeled data, together with label correlation constraints, to propagate label information from labeled to unlabeled data. Then, a classification model is trained on the reconstructed training set. The unlabeled samples serve as the validation set for optimizing the classification model, enabling it to make predictions on unlabeled samples. Finally, accurate label recovery is performed for samples with missing labels, resulting in a semi-supervised classification model. Experimental results demonstrate that, compared with other related methods, the proposed SWLSL algorithm improves label recovery accuracy and classification accuracy by 2 percentage points.

(3) Within the SWLSL algorithm, this paper proposes an improved label propagation (ILP) method. This method calculates the similarity between samples to determine the weights for label propagation among samples, constructing a fully connected relationship graph. Additionally, it computes label conditional probabilities as a constraint on extreme cases of label occurrence. The process is iterated until label recovery is complete. In ablation experiments on ILP, after multiple iterations under different label missing rates, the label recovery accuracy exceeds 80%.

This paper conducts experiments on multiple multi-label image datasets with weak labels, validating the effectiveness of the two proposed algorithms. The experimental results demonstrate that, for multi-label image classification tasks with incomplete labels, the proposed algorithms offer useful guidance and practical value.
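
The propagation idea summarized in contribution (3) can be illustrated with a minimal sketch. It is not the paper's ILP implementation: the Gaussian similarity kernel, the blending weight `beta`, the way the conditional probabilities are mixed into the scores, and every name below (`propagate_labels`, `sigma`, `n_iter`) are assumptions chosen only to make the general mechanism concrete.

```python
import numpy as np


def propagate_labels(X, Y, labeled_mask, sigma=1.0, beta=0.2, n_iter=20):
    """Illustrative graph-based label propagation (hypothetical helper).

    X            : (n, d) feature matrix.
    Y            : (n, c) binary label matrix; rows of unlabeled samples are zero.
    labeled_mask : (n,) boolean array, True where labels are known.
    Returns an (n, c) matrix of label scores in [0, 1].
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Fully connected graph: Gaussian similarity between every pair of samples
    # (the kernel choice is an assumption, not taken from the paper).
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)  # row-normalised propagation weights

    # Label conditional probabilities estimated from labeled samples:
    # C[i, j] approximates P(label j present | label i present).
    Yl = Y[labeled_mask]
    co = Yl.T @ Yl
    counts = np.maximum(np.diag(co), 1.0)
    C = co / counts[:, None]
    # Column-normalise so the correlation step is a weighted average of scores.
    Cn = C / np.maximum(C.sum(axis=0, keepdims=True), 1e-12)

    F = Y.copy()
    for _ in range(n_iter):
        S = P @ F                                # spread scores along graph edges
        F = (1.0 - beta) * S + beta * (S @ Cn)   # assumed correlation adjustment
        F[labeled_mask] = Y[labeled_mask]        # keep known labels fixed
    return F
```

Thresholding the returned scores (for example at 0.5, itself an arbitrary choice) yields recovered binary labels for the unlabeled samples. A fixed iteration count stands in for the "iterate until recovery is complete" criterion, which the abstract does not specify.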