
Research On Multi-Instance Learning Based On Instance Weighted Support Vector Machine

Posted on: 2017-05-24 | Degree: Master | Type: Thesis
Country: China | Candidate: L Y Zhang
GTID: 2308330485469640 | Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet and the explosion of information, the amount of data that people confront is growing rapidly. These rich data resources contain important, latent, and useful knowledge. However, real-world data may contain errors, e.g., missing attribute values, inappropriate data formats, duplicate and abnormal records, and incomplete, noisy, or inconsistent data. How to extract effective information from such data therefore remains an open problem.

Multiple-instance learning was originally proposed to solve the problem of drug activity prediction. In that setting, its purpose is to identify the active molecules among a large number of molecules, which helps pharmaceutical companies direct their limited resources toward more meaningful research. Many classic multiple-instance learning algorithms have since been proposed, including methods based on support vector machines, neural networks, and decision trees. Multiple-instance learning has been widely studied and applied in fields such as drug activity prediction, image retrieval, stock prediction, and object detection.

In traditional multiple-instance learning, the training set consists of a number of bags, and each bag contains a number of instances. Labels are attached to bags; the individual instances are unlabeled. By definition, each positive bag contains at least one positive instance, so a positive bag may contain negative instances alongside positive ones, whereas all instances in a negative bag are negative.
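The bag-labeling rule described above can be illustrated with a minimal Python sketch. The `Bag` class, the function name, and the toy feature vectors are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Bag:
    """A bag of feature vectors with a single bag-level label (+1 or -1)."""
    instances: List[Sequence[float]]
    label: int


def bag_label_from_instances(instance_labels: Sequence[int]) -> int:
    """Standard assumption: a bag is positive iff at least one instance is positive."""
    return 1 if any(lbl == 1 for lbl in instance_labels) else -1


# The hidden instance labels below only illustrate the rule;
# a multiple-instance learner never observes them during training.
positive_bag = Bag(instances=[[0.1, 0.2], [0.9, 0.8]],
                   label=bag_label_from_instances([-1, 1]))
negative_bag = Bag(instances=[[0.1, 0.3], [0.2, 0.1]],
                   label=bag_label_from_instances([-1, -1]))
```

Only `instances` and `label` would be available to a learner; the per-instance labels passed to `bag_label_from_instances` stand in for the unknown ground truth.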
The aim of multiple-instance learning is to train a classifier on the labeled bags and use it to predict the labels of unseen bags.

This thesis introduces the basic idea of the support vector machine (SVM) and presents a novel method that combines multiple-instance learning with SVM to reduce the impact of noise on classification results when the data are corrupted. Because of the imprecision of data acquisition equipment or transmission errors, the instances in bags may contain noisy information, which traditional multiple-instance learning methods may fail to handle. To solve this problem, we combine multiple-instance learning and SVM by weighting the instances in bags, which reduces the impact of noise when building the classifier. In the experiments, we use the musk molecule data set and an image retrieval data set to test the effectiveness of the proposed method and compare it with a variety of classical multiple-instance learning algorithms. The experimental results show that the proposed method outperforms the compared methods.
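The abstract does not state how the instance weights are computed or how the weighted SVM is optimized, so the following Python sketch shows only the general idea of an instance-weighted SVM under stated assumptions: a linear kernel, a weighted hinge loss minimized by batch subgradient descent, instances inheriting the label of their bag, and hand-set weights. All function names, hyperparameters, and data values are hypothetical.

```python
from typing import List, Sequence, Tuple


def train_weighted_linear_svm(
    X: List[Sequence[float]],
    y: List[int],
    weights: List[float],
    C: float = 1.0,
    lr: float = 0.01,
    epochs: int = 500,
) -> Tuple[List[float], float]:
    """Minimise 0.5*||w||^2 + C * sum_i weights[i] * max(0, 1 - y_i*(w.x_i + b))."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regulariser 0.5*||w||^2
        grad_b = 0.0
        for x_i, y_i, s_i in zip(X, y, weights):
            margin = y_i * (sum(wj * xj for wj, xj in zip(w, x_i)) + b)
            if margin < 1.0:  # hinge loss is active for this instance
                for j in range(dim):
                    grad_w[j] -= C * s_i * y_i * x_i[j]
                grad_b -= C * s_i * y_i
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(w: List[float], b: float, x: Sequence[float]) -> float:
    """Signed score of a single instance."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def predict_bag(w: List[float], b: float, bag: List[Sequence[float]]) -> int:
    """A bag is predicted positive if its highest-scoring instance is positive."""
    return 1 if max(decision(w, b, x) for x in bag) > 0 else -1


# Toy data: instances inherit the label of their bag (a simplifying assumption).
X = [[2.0, 2.0], [3.0, 2.5], [-2.5, -2.0], [-2.0, -2.0], [-3.0, -1.5]]
y = [1, 1, 1, -1, -1]
# Hypothetical weights: the third instance is suspected to be noisy,
# so its influence on the classifier is reduced.
weights = [1.0, 1.0, 0.05, 1.0, 1.0]

w, b = train_weighted_linear_svm(X, y, weights)
pred_positive = predict_bag(w, b, [[2.5, 2.0], [-2.0, -2.0]])
pred_negative = predict_bag(w, b, [[-2.0, -2.5], [-3.0, -2.0]])
```

Lowering an instance's weight shrinks its contribution to the hinge-loss term, so a suspected noisy instance pulls the decision boundary less than a trusted one. How such weights should be estimated from the data is the part this sketch leaves open.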
Keywords/Search Tags: multiple-instance learning, support vector machine, noisy information