
Multi-label Classification Algorithm Based On Hierarchical Random Forest

Posted on: 2018-06-02  Degree: Master  Type: Thesis
Country: China  Candidate: S H Lin  Full Text: PDF
GTID: 2348330533966334  Subject: Engineering
Abstract/Summary:
Multi-label classification arises in image recognition, text classification, medical diagnosis, gene analysis, information retrieval, personalized recommendation and other areas, and it has received growing attention in recent years. Each sample in a multi-label classification problem usually corresponds to a set of several labels, and the labels are interdependent, which reflects the semantic meaning of the sample. Current approaches fall into two families: problem transformation and algorithm adaptation. Although many algorithms have been proposed, most have shortcomings: they ignore the relationships between labels, their accuracy is low, or their complexity is high.

In this thesis, random forest is extended to the multi-label classification field. Based on the characteristics of the multi-label classification problem, two new algorithms are proposed.

(1) We propose a multi-label classification algorithm, Multi-label Extremely Randomized Forest. It improves the extremely randomized tree and adopts a new label reuse mechanism to capture the label dependencies of a multi-label dataset. The labels associated with a parent node can be reused by its child nodes, and this is combined with the characteristics of random forest to improve classification performance.

(2) We propose a multi-label classification algorithm, Multi-label Hierarchy Randomized Forest. The algorithm uses a divide-and-conquer strategy: it divides the large label set into small label sets and uses a clustering method to handle the coupling between labels. Because it is combined with the random forest algorithm, it inherits that algorithm's advantages, such as resistance to overfitting and high capacity.

Finally, we demonstrate that the proposed algorithms can solve the multi-label classification problem.
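The divide-and-conquer step in (2) can be illustrated with a small sketch. The abstract does not say which clustering method the thesis uses, so this example assumes greedy agglomerative merging driven by the Jaccard similarity of label co-occurrence. The names `jaccard`, `cluster_labels` and `max_size` are hypothetical and do not come from the thesis. In a full pipeline, each resulting label group would presumably be handed to its own random forest; that part is not shown.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of sample indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_labels(Y, max_size):
    """Greedily merge co-occurring labels into groups of at most max_size.

    Y is a list of label sets, one per sample. Returns a sorted list of
    label groups. This is an assumed clustering rule, not the thesis's.
    """
    # Map each label to the set of samples that carry it.
    support = {}
    for i, labels in enumerate(Y):
        for label in labels:
            support.setdefault(label, set()).add(i)

    # Start with one singleton group per label.
    groups = [[label] for label in sorted(support)]

    def group_support(group):
        # Samples that carry at least one label of the group.
        return set().union(*(support[label] for label in group))

    while True:
        best, best_pair = 0.0, None
        for g1, g2 in combinations(groups, 2):
            if len(g1) + len(g2) > max_size:
                continue  # merging would exceed the size limit
            sim = jaccard(group_support(g1), group_support(g2))
            if sim > best:
                best, best_pair = sim, (g1, g2)
        if best_pair is None:
            break  # no remaining pair co-occurs or fits the size limit
        g1, g2 = best_pair
        groups.remove(g1)
        groups.remove(g2)
        groups.append(sorted(g1 + g2))
    return sorted(groups)


# Toy image-tagging data: one label set per sample.
Y = [
    {"beach", "sea"},
    {"beach", "sea", "sunset"},
    {"city", "night"},
    {"city", "night"},
    {"sunset"},
]
print(cluster_labels(Y, max_size=2))
# [['beach', 'sea'], ['city', 'night'], ['sunset']]
```

Labels that always appear together ("beach" with "sea", "city" with "night") end up in the same small label set, while "sunset" stays alone because every merge that would include it exceeds `max_size`.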
Keywords/Search Tags: Multi-label classification, Random forest, Label association