The electric submersible pump (ESP) is a widely used type of artificial lift equipment. Because of its complex structure and harsh working environment, failures occur frequently. The ESP data obtained during oil production are of poor quality and mostly unlabeled, so they cannot meet the modeling needs of traditional machine learning algorithms. A fault diagnosis method is therefore needed that works with the data actually available for ESPs and achieves higher accuracy. This thesis develops a data-driven ESP fault diagnosis method that uses a small amount of labeled data together with a large amount of unlabeled data. Targeting the ESP equipment used in the development and production of the Bohai oil and gas field, and taking into account the fault characteristics and data conditions of the ESP, the thesis proposes an ESP fault diagnosis method based on semi-supervised learning. The failure mechanisms of the ESP and the associated parameter changes are analyzed in detail on the basis of its composition and production environment. In light of the characteristics of ESP data, three preprocessing steps are proposed: data cleaning, data standardization, and feature engineering. Together these provide the theoretical and data basis for building fault diagnosis models from ESP data. The processed data are then used to build a semi-supervised fault diagnosis model based on a self-training algorithm: a suitable algorithm is selected as the base classifier through performance comparison, and a small amount of labeled data serves as the training set for an initial classification model. High-confidence unlabeled samples are added to the training set to enlarge it, and these samples are also applied to the other fault models through cross-labeling, which reduces the risk of an accuracy drop caused by a poorly performing initial classification model. Next, the processed data are used to build a fault diagnosis model based on the semi-supervised support vector machine
algorithm (S3VM), which models labeled and unlabeled data together and searches for the classification hyperplane that maximizes the margin between classes; an unsupervised pre-clustering method is used to build a training set with balanced samples and so avoid overfitting. To further improve the accuracy of the semi-supervised fault diagnosis models, and following the principle of ensemble learning, this thesis adopts two ensemble learning frameworks, Bagging and Stacking, to combine the two semi-supervised fault diagnosis models into a comprehensive semi-supervised fault diagnosis method. Finally, the semi-supervised ensemble learning model is tested on data from ESP wells with faults. The results verify that the method achieves better fault classification accuracy and can significantly improve the classification performance of semi-supervised learning. The research results provide theoretical support and methods for research on ESP fault diagnosis.
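
The self-training step described above can be illustrated with a minimal sketch. It is not the thesis implementation. The nearest-centroid base classifier, the softmax confidence score, the 0.9 threshold, and the function names are all assumptions made for illustration. The thesis selects its base classifier by performance comparison, and the cross-labeling step is omitted here.

```python
import math


def centroids(X, y):
    """Return the mean feature vector of each class as {label: centroid}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_proba(model, x):
    """Softmax over negative distances to each centroid (assumed confidence score)."""
    scores = {c: -math.dist(x, mu) for c, mu in model.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Grow the training set with high-confidence pseudo-labels.

    Returns the final model and the samples that never reached the threshold.
    """
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = centroids(X_train, y_train)  # initial model from labeled data only
    for _ in range(max_rounds):
        remaining, added = [], 0
        for x in pool:
            proba = predict_proba(model, x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                # Confident prediction: treat it as a label and keep the sample.
                X_train.append(x)
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # nothing new was pseudo-labeled, so the model cannot change
        model = centroids(X_train, y_train)  # retrain on the enlarged set
    return model, pool
```

Calling `self_train` with one labeled sample per class and a few unlabeled samples returns the retrained centroids together with the unlabeled samples that stayed below the threshold. Leaving such ambiguous samples unlabeled is what limits the spread of errors from a weak initial model.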