Fuzziness Based Instance Selection For Supervised And Semi-Supervised Learning

Posted on: 2018-09-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Rana Aamir Raza Ashfaq
GTID: 1318330536955916
Subject: Computer Science
Abstract/Summary:
Building a scalable learning model (or classifier), in terms of storage, speed, and accuracy, is a fundamental and still challenging classification problem that demands new knowledge discovery methods. Characteristics of the data often make it difficult to construct a classifier with good predictive performance. Instance selection (IS) addresses some of these issues by selecting a subset of the instances that yields a better classifier with acceptable accuracy. This dissertation presents an effective and efficient instance selection methodology that relies on the fuzziness of instances (i.e., training or testing instances) to select an optimal subset of instances and so reach an acceptable solution to this problem.

One dimension of the instance selection problem is addressed in the supervised learning paradigm, where the learning algorithm attempts to discover the relationship between the input features and their class labels using a set of training instances. Many instances participate in the training process, but some of them are unnecessary for classification; it is therefore possible to obtain better generalization performance from the classifier after removing such unnecessary instances with the proposed fuzziness based instance selection methodology. To address this problem further, the dissertation also examines the relationship between the data points closest to the classification boundary and their fuzziness (i.e., the fuzziness of each instance's output); the core step is to explore the relationship between the generalization ability (i.e., the rate of correct classification) and the fuzziness of the respective classifier.

Another research problem is addressed in the semi-supervised learning (SSL) paradigm. In many real-world applications, obtaining an ample number of labeled instances is cumbersome, whereas unlabeled instances are easy to obtain. A fuzziness category based divide-and-conquer strategy is therefore introduced to improve classifier performance. This strategy relies on the fuzziness output by the base classifier for unlabeled (or unseen) instances.

Experimental results on several well-known classification data sets show that the proposed instance selection methodology improves the usefulness of a classifier. The resulting subset of instances effectively promotes the classifier's learning, and the classification boundaries between instances with different class values can be learned easily. The dissertation also presents an intrusion detection system (IDS) problem, using well-known intrusion detection (ID) data sets, which confirms the benefits of the proposed fuzziness based instance selection methodology. In the SSL case, fuzziness is treated as an important criterion for the unlabeled instances, and instance categorization is used to assign labels to unlabeled data effectively for an IDS problem. An efficient IS technique (used as a preprocessing step) based on a neural network with random weights (NNRw) is also presented for an IDS; it helps to optimize the training set so that the classification boundaries between instances of the normal and attack classes can be learned easily. Incorporating fuzziness into the instances, both for instance selection and for confident labeling of unlabeled instances, helps the classifiers (i.e., fuzzy classifiers) to minimize false alarms and improve the detection rate.
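
The abstract does not give the fuzziness formula or the category boundaries, so the following is only a minimal sketch under stated assumptions. It assumes the base classifier emits a membership vector per instance with entries in [0, 1], measures fuzziness with the De Luca and Termini fuzzy entropy (one common choice, not confirmed as the dissertation's), and splits instances into three groups at hypothetical quantile cut points. The names `fuzziness` and `categorize_by_fuzziness` are illustrative, and which groups are kept for training is not stated in the abstract, so the sketch stops at the grouping.

```python
import numpy as np


def fuzziness(memberships, eps=1e-12):
    """Fuzzy entropy of each row of a membership matrix (assumed measure).

    memberships: array-like of shape (n_instances, n_classes), values in [0, 1].
    Returns one value per instance; 0 means a crisp output, and the maximum
    (log 2) is reached when every membership equals 0.5.
    """
    # Clip away exact 0 and 1 so the logarithms stay finite.
    mu = np.clip(np.asarray(memberships, dtype=float), eps, 1.0 - eps)
    terms = mu * np.log(mu) + (1.0 - mu) * np.log(1.0 - mu)
    return -terms.mean(axis=1)


def categorize_by_fuzziness(fuzz, low_q=1 / 3, high_q=2 / 3):
    """Label each instance 'low', 'mid', or 'high' by fuzziness.

    The quantile cut points are hypothetical defaults, not values from the
    dissertation.
    """
    fuzz = np.asarray(fuzz, dtype=float)
    lo, hi = np.quantile(fuzz, [low_q, high_q])
    return np.where(fuzz <= lo, "low", np.where(fuzz >= hi, "high", "mid"))


# Toy outputs of a two-class base classifier for three unlabeled instances.
outputs = [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]]
f = fuzziness(outputs)
groups = categorize_by_fuzziness(f)
```

Under this measure, an instance whose output is close to (0.5, 0.5) lies near the classification boundary and receives high fuzziness, while a near-crisp output receives fuzziness close to zero.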
Keywords/Search Tags:Fuzziness, Instance selection process, Semi-supervised learning, Intrusion detection, Neural network, Random weights, Divide-and-conquer strategy