Research On Multi-Instance Learning Based On Instance Weighted Support Vector Machine

Posted on:2017-05-24

Degree:Master

Type:Thesis

Country:China

Candidate:L Y Zhang

GTID:2308330485469640

Subject:Software engineering

Abstract/Summary:

With the rapid development of the Internet and the explosion of information, the amount of data that people confront is growing rapidly. These rich data resources contain important and potentially useful knowledge. However, real-world data may contain errors, e.g., missing attribute values, inappropriate data formats, duplicate and abnormal records, and incomplete, noisy, or inconsistent data. How to extract effective information from such data therefore remains a problem to be solved.

Multiple-instance learning was initially proposed to solve the problem of drug activity prediction. In that setting, its purpose is to identify the active molecules among a large number of molecules, which can help pharmaceutical companies put their limited resources into more meaningful research. Many classic multiple-instance learning algorithms have since been proposed, such as algorithms based on support vector machines, neural networks, and decision trees. Multiple-instance learning has been widely studied and applied in many fields, such as drug activity prediction, image retrieval, stock prediction, and object detection.

In traditional multiple-instance learning, the training set consists of a number of bags, and each bag contains a number of instances. Labels are associated with bags; the individual instances have no labels. By definition, each positive bag contains at least one positive instance, so a positive bag may contain negative instances in addition to positive ones, whereas all instances in a negative bag are negative. The aim of multiple-instance learning is to train a classifier on the labeled bags and use it to predict the labels of unknown bags.

This thesis introduces the basic idea of the support vector machine (SVM) and presents a novel multiple-instance learning method that combines multiple-instance learning with SVM and reduces the impact of noise on the classification results when the data are corrupted by noise. Owing to the imprecision of data acquisition equipment or to transmission errors, the instances in bags may contain noisy information, which traditional multiple-instance learning methods may not be able to handle. To solve this problem, we combine multiple-instance learning and SVM by weighting the instances in bags, which reduces the impact of noise on building the classifier. In the experiments, we use the musk molecule data set and an image retrieval data set to test the effectiveness of the proposed method and compare it with a variety of classical multiple-instance learning algorithms. The experimental results show that the proposed method outperforms the compared methods.
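To make the setting concrete, the following is a minimal Python sketch of the bag labeling rule and of a linear SVM whose hinge loss is scaled per instance. It is not the method of the thesis: the abstract does not state how instance weights are computed, so the weights, the toy bags, the subgradient-descent trainer, and every function name here (`bag_label`, `train_weighted_linear_svm`, `predict_bag`) are assumptions chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Instance = Sequence[float]
Bag = List[Instance]


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption: a bag is positive iff any instance is positive."""
    return 1 if any(label == 1 for label in instance_labels) else -1


def score(w: Sequence[float], b: float, x: Instance) -> float:
    """Linear decision value w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_weighted_linear_svm(
    X: Sequence[Instance],
    y: Sequence[int],
    weights: Sequence[float],
    C: float = 1.0,
    lr: float = 0.01,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Subgradient descent on
    0.5 * ||w||^2 + C * sum_i weights[i] * max(0, 1 - y_i * (w . x_i + b)).
    A low weight shrinks the influence of an instance believed to be noisy.
    """
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regularizer 0.5 * ||w||^2
        grad_b = 0.0
        for xi, yi, ci in zip(X, y, weights):
            if yi * score(w, b, xi) < 1:  # instance violates the margin
                for d in range(dim):
                    grad_w[d] -= C * ci * yi * xi[d]
                grad_b -= C * ci * yi
        w = [wd - lr * gd for wd, gd in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_bag(w: Sequence[float], b: float, bag: Bag) -> int:
    """Predict a bag as positive if its highest-scoring instance is positive."""
    return 1 if max(score(w, b, x) for x in bag) > 0 else -1


# Toy data (hypothetical): one positive bag and one negative bag.
positive_bag: Bag = [(2.0, 2.0), (-1.0, -1.0)]   # second instance looks negative
negative_bag: Bag = [(-2.0, -2.0), (-1.5, -1.0)]

# Naive assumption: every instance inherits the label of its bag.
X = positive_bag + negative_bag
y = [1, 1, -1, -1]

# Hypothetical weights: the doubtful instance in the positive bag is down-weighted.
weights = [1.0, 0.1, 1.0, 1.0]

w, b = train_weighted_linear_svm(X, y, weights)
print(predict_bag(w, b, positive_bag), predict_bag(w, b, negative_bag))
```

Taking the maximum instance score in `predict_bag` mirrors the rule that a single positive instance is enough to make a bag positive. The per-instance weight only scales that instance's contribution to the loss, so how the weights are chosen decides how much noise is suppressed.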

Keywords/Search Tags:

multiple-instance learning, support vector machine, noisy information

Related items

1	Research On Multi-instance Learning Based On Support Vector Machine
2	Single And Multiple Instance Learning Based On Support Vector Data Description
3	Research On Some Problems Of Support Vector Machine Learning Algorithm
4	Research And Application For Face Recognition On Image Retrieval Method Based On Multi-instance Learning
5	Support Vector Machine Learning Under Noisy And Overlapping Data
6	Research And Application On Multi-Instance Learning Using Support Vector Machine
7	Research On Multiple Birth Support Vector Machines Models And Algorithms Based On Structural Information
8	A Study On Algorithm For Multi-instance Learning Based On Support Vector Data Description
9	Research On Multiple Respects Of Support Vector Machine
10	Research On Semi-Supervised Support Vector Machine Learning Methods