Semi-Supervised Deep Fuzzy C-Mean Clustering And Classification

Posted on:2020-06-30

Degree:Doctor

Type:Dissertation

Institution:University

Candidate:Ali Arshad

GTID:1368330602963875

Subject:Computer application technology

Abstract/Summary:

Semi-supervised learning has been successfully applied in research fields such as data mining and machine learning for dynamic data analysis. Imbalanced class learning is one of the most challenging issues in classification. In addition to dataset imbalance, redundant and irrelevant features used to train a model can hinder its classification performance. In recent years, numerous researchers have focused on binary and multi-class imbalanced classification. The key points of our study are as follows.

1. A semi-supervised Deep Fuzzy C-Mean (DFCM) clustering approach is proposed for binary and multi-class imbalanced data classification. It can be applied when data boundaries are not clearly defined and extra parameters are needed to reduce the statistical closeness. The proposed approach is essentially a two-stage data pre-processing technique for the classification model.

2. A novel semi-supervised DFCM clustering-based feature extraction technique is proposed. It creates new features by utilizing deep multi-clusters of supervised and unsupervised datasets that tend to maximize intra-cluster classes and intra-cluster features, using Fuzzy C-Mean (FCM) clustering. We also apply a decomposition strategy to the multi-clusters to extend our approach to multi-class imbalanced datasets, which associates, one-vs-one, the maximum similarity between intra-cluster classes and intra-cluster features. Fuzzy C-Mean is utilized to handle the overlapping problem.

3. A method combining feature selection with re-sampling (random under-sampling and random over-sampling) is proposed to reduce noisy data and handle the imbalance problem for classification. Finally, the classification model is based on the maximum homogeneity between the features of labeled and unlabeled data. The results show that models combined with the novel DFCM data pre-processing approach perform better, owing to its ability to identify and combine essential information in the data features.

4. To check the efficiency of our proposed approach, we chose datasets from real-world software projects (NASA & Eclipse) for binary-class imbalanced data classification and 18 UCI benchmark datasets for multi-class imbalanced data classification. We then compared our approach with state-of-the-art binary-class and multi-class imbalance learning algorithms using the performance measures Pd, Accuracy, F-Measure, and Area Under the Curve (AUC), and further investigated the influencing factors in our approach. The results show that the performance of the proposed DFCM technique is stable and effective on all types of binary and multi-class imbalanced datasets.
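
The abstract relies on Fuzzy C-Mean clustering to handle overlapping classes by giving each sample a degree of membership in every cluster. As a rough illustration only, the sketch below implements the standard, textbook FCM update rules in NumPy. It is not the DFCM method of the dissertation, and the function name, parameter names (`m`, `tol`, `seed`), and toy data are all assumptions made for this example.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard FCM (illustrative sketch, not the dissertation's DFCM).

    X: (n_samples, n_features) array; m > 1 is the fuzzifier.
    Returns (centers, U) where U[i, j] is the membership of sample i in cluster j.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy imbalanced data (hypothetical): 90 samples in one group, 10 in another.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=0.0, scale=0.3, size=(90, 2)),
    data_rng.normal(loc=5.0, scale=0.3, size=(10, 2)),
])
centers, U = fuzzy_c_means(X, n_clusters=2)
hard_labels = U.argmax(axis=1)  # crisp assignment, if one is needed
```

Each row of `U` sums to 1, so a sample lying between two clusters receives intermediate memberships instead of being forced into a single cluster. That soft assignment is the property that makes fuzzy clustering a natural fit for overlapping class boundaries.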

Keywords/Search Tags:

Semi-supervised Learning, Deep Fuzzy C-Mean clustering, Feature Learning, Imbalanced dataset, Multi-class Classification

Related items

1	Research On Violation Data Identification Technology Based On Deep Learning
2	Research On Imbalanced Dataset Classification In Semi-supervised Learning
3	Reliable Semi-supervised Learning For Evolving Data Stream
4	A Study On Some Problems Of Semi-supervised Learning
5	Fault Classification Based On Modified Active Learning And Semi-Supervised Learning
6	Research On Semi-Supervised Classification Based On Local Learning
7	Research On Sentiment Classification Based-upon Imbalanced Data
8	A Research On Imbalanced Learning Based On Semi-supervised SVM
9	Research On Semi-supervised Clustering And Classification Algorithm
10	Selection And Classification Of Unbalanced Data Based On Semi-Supervised And Integrated Learning