
Multi-label Classification Based On Double-layer And Optimal Selection

Posted on: 2016-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Liu
GTID: 2308330461992250
Subject: Computer application technology
Abstract/Summary:
Classification is a popular branch of data mining. It aims to construct a classification model from a training sample set and then apply that model to unseen data to predict category information. In traditional single-label classification, a sample instance belongs to only one category. However, many real-world examples are ambiguous: a sample instance may belong to more than one category at once, and the corresponding problem is referred to as multi-label classification. Multi-label learning originated from the ambiguity problem in document classification. After decades of development, multi-label classification has been widely applied in medical diagnosis, biogenetics, recommender systems, information retrieval, image and video analysis, and other fields.

With this increasing attention, algorithms for multi-label classification are constantly emerging. However, many of them, such as BC, CC and MBR, share several issues: the correlation among labels is ignored; the label sequence is selected randomly; label information interacts redundantly; and information is lost during the interaction. All of these issues reduce precision, and the loss is most pronounced when a multi-label problem is transformed into one or more binary classification problems. Therefore, this thesis first proposes a chain-based multi-label classification algorithm with a two-layer structure (DLMC). DLMC constructs a two-layer structure that builds associations between labels through inter-layer and intra-layer interaction. The first layer, implemented with a typical binary classification model, performs the first classification and also exchanges label information with the lower layer. The second layer is a dynamically updated chain-based classifier that passes and renews label information to achieve a secondary interaction.
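The two-layer idea described above can be illustrated with a minimal sketch. It is not the thesis's DLMC implementation: the abstract does not specify the base classifier, the features passed between layers, or the update rule, so everything below is an assumption made for illustration. Here the first layer trains one independent binary classifier per label, and the second layer is a chain whose classifiers see the original features, the first-layer predictions, and the labels earlier in the chain. The names `CentroidClassifier`, `fit_binary_relevance`, `fit_chain` and `predict_two_layer` are hypothetical.

```python
# Illustrative sketch only; NOT the thesis's DLMC algorithm.
# Assumptions: a toy nearest-centroid base classifier, and a second-layer
# chain that receives first-layer predictions as extra features.
from typing import Dict, List, Sequence


class CentroidClassifier:
    """Toy binary classifier: predicts the class with the nearer mean vector."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "CentroidClassifier":
        self.centroids: Dict[int, List[float]] = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # skip a class that never occurs in training
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def sq_dist(c: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


def fit_binary_relevance(X, Y) -> List[CentroidClassifier]:
    """First layer: one independent binary classifier per label."""
    n_labels = len(Y[0])
    return [CentroidClassifier().fit(X, [row[j] for row in Y]) for j in range(n_labels)]


def predict_binary_relevance(models, x) -> List[int]:
    return [m.predict_one(x) for m in models]


def fit_chain(X, Y, first_layer, order) -> List[CentroidClassifier]:
    """Second layer: a chain over `order`. Each link sees the features,
    the first-layer predictions, and the true labels earlier in the chain."""
    chain = []
    for pos, j in enumerate(order):
        rows = []
        for x, y in zip(X, Y):
            layer1 = predict_binary_relevance(first_layer, x)
            earlier = [y[k] for k in order[:pos]]
            rows.append(list(x) + layer1 + earlier)
        chain.append(CentroidClassifier().fit(rows, [y[j] for y in Y]))
    return chain


def predict_two_layer(first_layer, chain, order, x) -> List[int]:
    """Predict all labels; `order` must be a permutation of the label indices."""
    layer1 = predict_binary_relevance(first_layer, x)
    earlier: List[int] = []
    out: Dict[int, int] = {}
    for model, j in zip(chain, order):
        pred = model.predict_one(list(x) + layer1 + earlier)
        out[j] = pred
        earlier.append(pred)  # predicted labels replace true labels at test time
    return [out[j] for j in range(len(order))]


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    Y = [[0, 0], [0, 0], [1, 1], [1, 1]]
    order = [0, 1]
    layer1 = fit_binary_relevance(X, Y)
    chain = fit_chain(X, Y, layer1, order)
    print(predict_two_layer(layer1, chain, order, [5.0, 5.5]))
```

In this sketch the chain order is passed in explicitly, which makes the dependence on label order visible: a different `order` changes which earlier labels each link can use.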
Next, a maximum-weight label spanning (MWTOS) algorithm is proposed to find the optimal order of the labels. MWTOS addresses the loss of classification accuracy caused by selecting the label sequence at random. Finally, a multi-label classification algorithm with optimal sequence based on double layers (DLMOS) is proposed. It combines the binary independence model, the chain classification model and the nested stack model with an optimization procedure. DLMOS not only inherits the advantages of DLMC and MWTOS but also resolves the redundancy or loss of label interaction information. In addition, simulated datasets with strong label correlation are constructed to verify the ability of DLMOS to handle correlated labels. Experiments on benchmark datasets and simulated datasets are performed to validate the effectiveness of the proposed approach in comparison with other well-established methods.
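One way to derive a label order from a maximum-weight spanning structure is sketched below. This is not the thesis's MWTOS algorithm, whose weights and traversal rule are not given in the abstract. The sketch assumes that the edge weight between two labels is the absolute Pearson correlation of their columns, that the tree is grown with Prim's algorithm from a chosen root, and that the label order is the order in which labels join the tree. The function names are hypothetical.

```python
# Illustrative sketch only; NOT the thesis's MWTOS algorithm.
# Assumptions: edge weight = absolute Pearson correlation between label
# columns; tree grown with Prim's algorithm; order = insertion order.
from typing import List, Sequence, Tuple


def abs_correlation(a: Sequence[int], b: Sequence[int]) -> float:
    """Absolute Pearson correlation; 0.0 if either column is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return abs(cov) / (var_a * var_b) ** 0.5


def correlation_matrix(Y: Sequence[Sequence[int]]) -> List[List[float]]:
    """Symmetric label-by-label weight matrix with a zero diagonal."""
    cols = [list(c) for c in zip(*Y)]
    n = len(cols)
    return [
        [0.0 if i == j else abs_correlation(cols[i], cols[j]) for j in range(n)]
        for i in range(n)
    ]


def max_spanning_tree_order(
    W: Sequence[Sequence[float]], root: int = 0
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Prim's algorithm for a maximum-weight spanning tree.
    Returns the labels in the order they join the tree, plus the tree edges."""
    n = len(W)
    in_tree = [root]
    edges: List[Tuple[int, int]] = []
    while len(in_tree) < n:
        # Heaviest edge that connects the current tree to a new label.
        _, u, v = max(
            ((W[u][v], u, v) for u in in_tree for v in range(n) if v not in in_tree),
            key=lambda e: e[0],
        )
        edges.append((u, v))
        in_tree.append(v)
    return in_tree, edges


if __name__ == "__main__":
    # Labels 0 and 2 are identical; label 1 is only weakly related to them.
    Y = [[1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 0], [1, 1, 1], [0, 0, 0]]
    order, edges = max_spanning_tree_order(correlation_matrix(Y))
    print(order, edges)
```

The resulting order places strongly correlated labels next to each other, so a chain classifier that follows it passes the most informative label to its neighbour. Other weights, such as mutual information, could be swapped in for `abs_correlation` without changing the rest of the sketch.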
Keywords/Search Tags:Optimal sequence, Double layer, Multi-label classification, Problem transformation, Information interaction