| In recent years, multi-label classification has developed rapidly, but it still faces several challenges. First, the output space of multi-label classification grows exponentially with the number of labels. Most researchers try to exploit the relationships between labels to address this problem, so how to mine these relationships effectively has become a research topic in its own right. In addition, imbalanced label categories in multi-label data sets also pose challenges for multi-label classification. In response to these issues, the main contributions of this paper are as follows. (1) To address the problem that classic multi-label classification algorithms cannot fully exploit the relationships between labels in multi-label data sets, three frameworks, LDA-ML, BTM-ML and WNTM-ML, are proposed, based on LDA, BTM and WNTM respectively. The three frameworks establish a latent topic layer over the labels to mine the relationships between labels, and add the resulting topics, which encode these relationships, to the features to improve the classification ability of classic multi-label classification algorithms. In addition, the word-frequency information of the labels is used to strengthen the role of key labels and to improve the label relationships mined after modeling. (2) For scenarios in which the labels of a multi-label data set are imbalanced, FAL, a multi-label classification algorithm based on a supervised topic model, is proposed. The algorithm uses the supervised topic model to establish the relationship between features and labels, and updates the model's Dirichlet priors using the word-frequency information of the features and the number of labels of each sample, so that the priors better match the prior distribution of the features and the label distribution of each instance. This ultimately improves the classification results of the algorithm. (3) For scenarios in which a multi-label data set has a large number of labels and complex relationships between them, FNAL, a multi-label classification algorithm based on a supervised topic model, is proposed. In the training phase, the algorithm uses WNTM, which is effective at modeling short texts, to model the relationship between labels and latent topics. In the prediction phase, it uses the sampling information to update the Dirichlet prior of the label distribution of the instances to be predicted, which yields more accurate prior information about their label distribution and improves the classification ability for labels. (4) For the practical application scenario of diagnosing Parkinson’s disease in traditional Chinese medicine (TCM), a solution is proposed that combines the frameworks and algorithms introduced above. The data set was converted from the Parkinson’s Disease Scale provided by Nanjing Brain Hospital. The TCM scale collects each patient’s symptoms by means of TCM syndrome differentiation. Each patient has either one main TCM syndrome, or one main TCM syndrome together with one secondary TCM syndrome. In this paper, the TCM diagnosis of Parkinson’s disease is transformed into a multi-label classification problem by using the symptoms as features and the syndromes as labels, and the proposed multi-label classification frameworks and algorithms are then applied to the converted problem. Experiments show that the proposed methods achieve good classification performance on the Parkinson’s data set obtained by modeling this practical application scenario. |
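
The conversion in contribution (4), with symptoms as features and syndromes as labels, can be illustrated with a minimal sketch. It is not the procedure used to build the actual data set: the record layout, the helper names `build_index` and `to_multilabel`, and the symptom and syndrome names are all hypothetical placeholders. It only assumes, as stated above, that each patient has one main syndrome and at most one secondary syndrome.

```python
# Hypothetical patient records. The field names and the symptom/syndrome
# names are placeholders for illustration, not values from the real scale.
records = [
    {"symptoms": ["symptom_a", "symptom_c"], "main": "syndrome_1", "secondary": None},
    {"symptoms": ["symptom_b"], "main": "syndrome_2", "secondary": "syndrome_1"},
    {"symptoms": ["symptom_a", "symptom_b", "symptom_c"], "main": "syndrome_3", "secondary": None},
]


def build_index(names):
    """Map each distinct name to a column index, in sorted order."""
    return {name: i for i, name in enumerate(sorted(set(names)))}


def to_multilabel(records):
    """Turn records into a binary feature matrix X and a binary label matrix Y."""
    symptom_index = build_index(s for r in records for s in r["symptoms"])
    syndrome_index = build_index(
        [r["main"] for r in records]
        + [r["secondary"] for r in records if r["secondary"]]
    )
    X, Y = [], []
    for r in records:
        x = [0] * len(symptom_index)
        for s in r["symptoms"]:
            x[symptom_index[s]] = 1  # symptom present -> feature set to 1
        y = [0] * len(syndrome_index)
        y[syndrome_index[r["main"]]] = 1  # main syndrome is always a label
        if r["secondary"]:
            y[syndrome_index[r["secondary"]]] = 1  # optional second label
        X.append(x)
        Y.append(y)
    return X, Y, symptom_index, syndrome_index


X, Y, symptom_index, syndrome_index = to_multilabel(records)
```

Each row of `Y` then contains one or two ones, which is what makes the task multi-label rather than single-label, and any multi-label classifier can be trained on the pair `(X, Y)`.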