
The Fault Feature Extraction Method For Sample Imbalance

Posted on: 2016-05-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Wang
GTID: 1318330482455786
Subject: Control theory and control engineering
Abstract/Summary:
With the growing complexity of modern industrial systems, data-driven fault diagnosis has become an active research topic. Unlike classical approaches, which require precise mathematical models, data-driven approaches use process data to build a 'black-box' model that approximates the mapping between a system's inputs and its states. This makes them better suited to complicated modern industrial systems.

Data-driven fault diagnosis still faces new challenges on the way from theory to practice, and one of them is sample imbalance. The samples used for diagnosis are the foundation of any fault diagnosis method, yet they are imbalanced in three ways. First, most samples have no labels. Second, the relatively redundant information collected by sensors causes the "curse of dimensionality". Third, the number of samples per condition is imbalanced: most are normal-condition samples and only a small part are fault-condition samples. Together, these facts give the diagnosis samples seriously imbalanced characteristics (label imbalance, value imbalance and class imbalance).

To deal with these challenges, this dissertation discusses several fault feature extraction techniques. The main work is as follows.

First, for label imbalance, semi-supervised principal component analysis (SSPCA) is discussed for fault diagnosis. It introduces semi-supervised learning into traditional principal component analysis (PCA). SSPCA effectively extracts feature information from polluted data by decreasing the influence of abnormal process samples and increasing the influence of normal process samples.

Then, for value imbalance, two fault feature extraction algorithms are proposed: locally preserving principal component analysis (LPPCA) and joint Fisher discriminant analysis (JFDA).
To extract the features of normal samples fully, LPPCA incorporates the idea of locality preserving into the optimization objective of PCA. When the original space is projected onto a low-dimensional space, the algorithm not only maximizes the overall variance but also keeps the local neighborhood structure unchanged.

JFDA is designed for classification. The algorithm defines a novel robust criterion to overcome the effects of non-Gaussian and non-linear structures when calculating geometric centers. To reduce the effect of outliers on the geometric center, the energy density of each datum is calculated. To further alleviate the effect of non-linear structure on the global and local data structure, a kernel method is used to minimize the distortion of both.

Finally, for class imbalance, ensemble manifold sensitive marginal Fisher analysis (ESMFA) and imbalanced support vector data description-radius-recursive feature selection (ISVDD-radius-RFE) are proposed. ESMFA consists of three key components: (1) at the global level, a bagging-based ensemble model is used to overcome the overfitting caused by data shift; (2) at the local level, a manifold-based oversampling method, named the weighted synthetic minority oversampling technique, is proposed to solve the small-sample problem in the minority class; (3) sensitive marginal Fisher analysis is used to address the challenge caused by class overlapping.

ISVDD-radius-RFE is a feature selection algorithm that combines supervised and unsupervised methods. When selecting features, the algorithm not only includes the discriminant information from fault samples but also considers the imbalance characteristics of fault detection. As a result, ISVDD-radius-RFE can describe the boundary of the normal condition more precisely.
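As an illustration of the oversampling idea used at the local level of ESMFA, the following is a minimal sketch of plain SMOTE (synthetic minority oversampling) in Python with NumPy. It is not the weighted, manifold-based variant proposed in the dissertation, whose details the abstract does not give. The function name `smote_oversample` and its parameters are hypothetical and chosen only for this example.

```python
import numpy as np


def smote_oversample(minority, n_new, k=5, seed=0):
    """Plain SMOTE sketch (not the dissertation's weighted variant).

    Each synthetic sample is a random point on the line segment between
    a minority sample and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances between minority samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest neighbours of every sample.
    neighbours = np.argsort(d2, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)   # which sample to start from
    pick = rng.integers(0, k, size=n_new)   # which neighbour to move toward
    partner = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # interpolation factor in [0, 1)

    return X[base] + gap * (X[partner] - X[base])


if __name__ == "__main__":
    # Hypothetical fault-condition (minority) samples with two features.
    faults = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    synthetic = smote_oversample(faults, n_new=6, k=2)
    print(synthetic.shape)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region already occupied by the minority class. A weighted variant would presumably change how the base samples or neighbours are chosen, but that is an assumption rather than something stated in the abstract.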
Keywords/Search Tags: data-driven, fault diagnosis, sample imbalance, feature extraction, feature selection, principal component analysis, Fisher discriminant analysis, semi-supervised learning