Research On Machine Learning Methods To Reduce Labeling Cost

Posted on: 2022-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: W T Li
GTID: 2518306725493084
Subject: Computer Science and Technology
Abstract/Summary:
Machine learning requires large amounts of labeled data, and that data usually has to be labeled manually, which is time-consuming and expensive. How to reduce labeling cost while still training models that meet accuracy requirements is an active research topic in machine learning. Active learning queries labels for the most informative samples, aiming to obtain a good model with as few labels as possible and thereby reduce labeling cost. However, if the samples queried by active learning are labeled manually, the labeling cost will still be unacceptable. Crowdsourcing is a cheap way to collect labels, but label quality is low because workers vary in skill, so expert validation is needed to further improve label quality. This thesis studies active learning and expert validation methods that reduce labeling cost. The main contributions are as follows:

1. We propose an active learning method based on local and global information. The method estimates a sample's local information from the difference between the model's outputs on the sample and on its neighbors. It exploits global information by partitioning the data into clusters and dynamically allocating the labeling budget according to the model's performance on each cluster. Experimental results show that the method effectively reduces the amount of labeled data needed to train the model.

2. We propose an expert validation method based on collaborative labeling. The method combines label inference with model learning in a collaborative labeling framework: label inference supplies labeled data to the model, and the model's predictions are used to update the label inference results. Expert validation is then performed on the samples for which label inference and model prediction disagree. Experimental results show that this method reduces the number of labels that need to be validated by experts.
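
The disagreement rule in the second contribution can be illustrated with a minimal sketch. It is not the thesis's implementation: majority voting stands in for the unspecified label inference step, the model predictions are made-up values, and the function names `infer_labels` and `select_for_expert` are hypothetical.

```python
from collections import Counter


def infer_labels(crowd_labels):
    """Infer one label per sample by majority vote over worker labels.

    Assumption: majority voting is a placeholder for the thesis's
    label inference method, which the abstract does not specify.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in crowd_labels]


def select_for_expert(inferred, predicted):
    """Return indices of samples where the inferred label and the
    model's predicted label disagree; these go to expert validation."""
    return [i for i, (a, b) in enumerate(zip(inferred, predicted)) if a != b]


# Toy data (illustrative only): three crowd votes per sample.
crowd_labels = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 1]]
# Hypothetical predictions from a model trained on the inferred labels.
model_predictions = [1, 0, 0, 1]

inferred = infer_labels(crowd_labels)
to_validate = select_for_expert(inferred, model_predictions)
print(to_validate)  # indices of samples to send to an expert
```

In this toy run only the third sample is flagged, so an expert checks one label instead of four. In the framework the abstract describes, the model's predictions also feed back into label inference; that feedback loop is omitted here.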
Keywords/Search Tags: crowdsourcing, deep learning, active learning, expert validation