
Research On Multi-Instance Classification Based On Instance Filtering

Posted on: 2019-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: F Yi
Full Text: PDF
GTID: 2428330566487282
Subject: Engineering
Abstract/Summary:
With the rapid development of information technology, machine learning plays an increasingly important role in many scientific fields. Multi-instance learning evolved from traditional machine learning. In multi-instance learning, the training set consists of a number of bags that carry concept labels, and each bag contains several instances that carry no labels of their own. If a bag contains at least one positive instance, the bag is labeled positive; if every instance in a bag is negative, the bag is labeled negative. By learning from the training bags, the learning system is expected to predict the labels of bags outside the training set as accurately as possible. In practical applications, however, positive instances are often so sparse within a positive bag that the negative instances dominate the classification result. This weakens the role of the positive instances in classification and significantly reduces prediction accuracy.

To address this problem, this thesis proposes filtering the instances in positive bags, that is, removing likely negative instances so that the proportion of positive instances in each positive bag is as large as possible. Based on the different characteristics of data sets, we propose the following two methods.

First, cluster-based multi-instance filtering classification. Data with the same label may have similar attribute values (i.e., lie close together in feature space), so clustering can divide them into corresponding clusters and thereby separate positive from negative instances. We propose a multi-instance filtering classification method based on K-means: the instances of the negative bags in the training set are clustered, and the distance from each instance of a positive training bag to the cluster centers is then computed. The closer an instance lies to a cluster center, the more likely it is a negative instance. Such instances are filtered out of the positive bags, which improves the performance of the multi-instance learning model.

Second, multi-instance filtering classification based on the KLIEP (Kullback-Leibler Importance Estimation Procedure) algorithm. Data with the same label has a similar distribution, or similar probability density. For this case we propose a multi-instance filtering classification method based on KLIEP: weights are first assigned to the instances in the positive training bags, and the weight vector is then learned by minimizing the difference between the distributions of instances in the positive and negative bags. The larger an instance's weight, the more likely it is a negative instance, so negative instances can be filtered out of the positive bags and the performance of the multi-instance learning model improves.
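
The cluster-based filtering idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the function names (`kmeans`, `filter_positive_bag`), the fixed distance `threshold`, and the rule of keeping the farthest instance when a bag would otherwise become empty are all assumptions made for illustration, since the abstract does not state how the cut-off is chosen.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns a list of k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                # Keep the old center if a cluster lost all its members.
                new_centers.append(centers[i])
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def filter_positive_bag(bag, centers, threshold):
    """Drop instances that lie within `threshold` of a negative cluster center.

    Assumption: if every instance would be dropped, keep the one farthest
    from the negative centers so the bag stays non-empty.
    """
    scored = [(min(dist(x, c) for c in centers), x) for x in bag]
    kept = [x for d, x in scored if d > threshold]
    if not kept:
        kept = [max(scored, key=lambda t: t[0])[1]]
    return kept


# Toy data: instances pooled from negative training bags form two groups.
negative_instances = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (10.0, 10.0), (10.1, 9.8), (9.9, 10.2),
]
centers = kmeans(negative_instances, k=2)

# A positive bag with two negative-looking instances and one outlier.
positive_bag = [(0.1, 0.1), (10.2, 9.9), (5.0, 5.0)]
filtered = filter_positive_bag(positive_bag, centers, threshold=1.0)
print(filtered)
```

In this toy setting the instance far from both negative clusters is the one that survives filtering. A real pipeline would pass the filtered bags on to a multi-instance classifier, and the number of clusters and the threshold would need to be tuned per data set.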
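
The KLIEP-based weighting described in the abstract can be sketched as follows, under stated assumptions. The sketch models the weight of a positive-bag instance as a non-negative mixture of Gaussian kernels centered on the negative-bag instances and fits the mixture with the commonly described KLIEP gradient-ascent-and-projection loop. The function name `kliep_weights`, the fixed kernel width `sigma`, the learning rate, and the choice of kernel centers are illustrative assumptions, not details taken from the thesis.

```python
import math


def gaussian_kernel(x, c, sigma):
    """Gaussian kernel between two equal-length tuples."""
    sq = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-sq / (2.0 * sigma ** 2))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kliep_weights(pos_instances, neg_instances, sigma=1.0, lr=0.01, n_iter=500):
    """Estimate w(x) ~ p_neg(x) / p_posbag(x) for each positive-bag instance.

    A large weight means the instance looks like the negative-bag instances.
    """
    centers = neg_instances  # assumption: every negative instance is a kernel center
    # A[j][l]: kernel between negative instance j and center l.
    A = [[gaussian_kernel(x, c, sigma) for c in centers] for x in neg_instances]
    # b[l]: mean kernel value between the positive-bag instances and center l.
    b = [
        sum(gaussian_kernel(x, c, sigma) for x in pos_instances) / len(pos_instances)
        for c in centers
    ]
    bb = dot(b, b)

    def project(alpha):
        # Enforce mean weight 1 over positive-bag instances and alpha >= 0.
        ba = dot(b, alpha)
        alpha = [a + (1.0 - ba) * bv / bb for a, bv in zip(alpha, b)]
        alpha = [max(0.0, a) for a in alpha]
        ba = dot(b, alpha)
        return [a / ba for a in alpha]

    alpha = project([1.0] * len(centers))
    for _ in range(n_iter):
        # Gradient of sum_j log w(x_neg_j) with respect to alpha.
        w_neg = [dot(row, alpha) for row in A]
        grad = [
            sum(A[j][l] / w_neg[j] for j in range(len(A)))
            for l in range(len(centers))
        ]
        alpha = project([a + lr * g for a, g in zip(alpha, grad)])

    return [
        sum(a * gaussian_kernel(x, c, sigma) for a, c in zip(alpha, centers))
        for x in pos_instances
    ]


# Toy data: negative-bag instances near the origin; the positive bag holds
# two similar instances and one that lies far from the negative region.
neg = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.1)]
pos = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
weights = kliep_weights(pos, neg)
print(weights)
```

Instances with the largest weights would be the candidates for removal from the positive bag. In practice the kernel width is usually selected by cross-validation rather than fixed, and how many high-weight instances to drop is a separate design decision that the abstract leaves open.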
Keywords/Search Tags: Multi-Instance, Bags, Instance Filtering, K-means, KLIEP Algorithm