Research On Active Learning Algorithm Based On Extreme Learning Machine

Posted on: 2024-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Gu
Full Text: PDF
GTID: 2568307157952989
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
Due to the rapid development of storage devices, networks, and compression technologies, data of many types are growing explosively. As a result, the following scenario is becoming increasingly common: acquiring large numbers of data instances is often easy, but labeling them is extremely expensive and time-consuming. To address this issue, active learning has emerged as an important machine learning paradigm. This thesis focuses on the design of the query strategy and proposes two active learning query strategies, one for single-label learning and one for multi-label learning. In the single-label scenario, existing query strategies tend to consider only exploitation and ignore exploration, which leads to the cold-start problem and to local optima. To solve this problem, this thesis proposes an improved algorithm called AL-SNN-ELM. In multi-label scenarios, designing a query strategy is difficult for three reasons: 1) labeling is more expensive than in single-label scenarios; 2) it is difficult to measure an instance's information content uniformly across all labels; and 3) class imbalance is widespread in multi-label data. To address these problems, this thesis proposes a multi-label active learning query strategy called PLVI-CE. The research contents and innovations of this thesis cover the following two points.

1. An active learning algorithm that considers exploration and exploitation simultaneously (AL-SNN-ELM)

AL-SNN-ELM contains two sequential query sub-procedures: an exploration sub-procedure and an exploitation sub-procedure. In the exploration sub-procedure, the shared nearest neighbor (SNN) clustering algorithm explores the spatial distribution of the data to query representative instances. The exploitation sub-procedure transforms the actual output of the extreme learning machine (ELM) into posterior probabilities to query uncertain instances. In other words, the exploration sub-procedure roughly locates the decision boundary by observing the global distribution of the data, while the exploitation sub-procedure fine-tunes this boundary by observing the decisions on the local instances surrounding it. In addition, to reduce the time complexity of active learning, this thesis adopts an online-sequential extreme learning machine (OS-ELM) in place of the traditional ELM. During active learning, the classification model is not retrained after each iteration; only the results of the previous iteration need to be updated.

2. A multi-label active learning algorithm that considers both uncertainty and diversity (PLVI-CE)

This thesis designs a multi-label active learning query strategy called PLVI-CE, which considers both uncertainty and diversity. Uncertainty is measured by the inconsistency between two predicted label vectors for the same unlabeled instance, while the diversity of each unlabeled instance is measured by the average difference between its posterior probability and those of all labeled instances. In addition, to cope with class imbalance in the multi-label scenario, this thesis uses a label-weighted extreme learning machine (LW-ELM) as the base classification model in the multi-label active learning framework. LW-ELM is chosen for three advantages: 1) lower computational cost; 2) stronger generalization performance; and 3) the ability to be applied directly to class-imbalanced multi-label data, so that the designed query strategy can provide approximately unbiased queries in the multi-label scenario.

For the AL-SNN-ELM algorithm, this thesis conducts extensive experiments on 22 UCI benchmark datasets and two real-world datasets. The results show that AL-SNN-ELM significantly outperforms AL-ELM. Specifically, on the aggregation dataset, the ALC value of the proposed algorithm is 3.5% higher than that of AL-ELM. This confirms that the design of active learning query strategies needs to consider both exploration and exploitation. For the PLVI-CE algorithm, this thesis conducts extensive experiments on 12 benchmark multi-label datasets. The results show the effectiveness and superiority of PLVI-CE compared with several state-of-the-art multi-label active learning algorithms. Specifically, on the flags dataset, the proposed algorithm outperforms the advanced comparison algorithm AUDI with a 2.61% improvement in Micro-F1 score, a 2.14% improvement in Macro-F1 score, and a 2.58% improvement in Hamming loss.
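
The abstract states that OS-ELM updates the previous iteration's result instead of retraining, and that ELM outputs are converted into posterior probabilities to query uncertain instances, but it gives no equations. The following is only a minimal sketch under stated assumptions: the standard recursive least-squares form of OS-ELM with ridge regularization, a sigmoid hidden layer, a softmax to turn raw outputs into pseudo-posteriors, and a smallest-margin uncertainty criterion. The class `OSELM`, the function `most_uncertain`, and all parameter names and defaults are hypothetical and are not taken from the thesis.

```python
import numpy as np


class OSELM:
    """Illustrative online-sequential ELM (not the thesis implementation)."""

    def __init__(self, n_inputs, n_hidden, reg=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Random hidden-layer parameters, fixed after initialization.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.reg = reg     # assumed ridge term
        self.P = None      # inverse of (H^T H + reg * I) over all data seen
        self.beta = None   # output weights

    def hidden(self, X):
        # Sigmoid hidden-layer output matrix H.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X, T):
        # Batch solution on the initial labeled set.
        H = self.hidden(X)
        self.P = np.linalg.inv(H.T @ H + self.reg * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ T

    def update(self, X, T):
        # Recursive least-squares step: fold in newly labeled instances
        # without revisiting the earlier ones.
        H = self.hidden(X)
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict_proba(self, X):
        # Softmax over raw outputs: one assumed way to get pseudo-posteriors.
        out = self.hidden(X) @ self.beta
        e = np.exp(out - out.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)


def most_uncertain(model, X_pool):
    """Index of the pool instance with the smallest top-two probability margin."""
    p = np.sort(model.predict_proba(X_pool), axis=1)
    return int(np.argmin(p[:, -1] - p[:, -2]))
```

In an active learning loop built on this sketch, each round would call `most_uncertain` on the unlabeled pool, obtain the label for the returned instance, and pass that single instance to `update`. With the same `reg`, the sequentially updated weights match a batch ridge solution over all labeled data, which is what makes skipping the retraining step safe.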
Keywords/Search Tags: Active learning, Multi-label active learning, Extreme learning machine, Shared nearest neighbor clustering, Class imbalance