
Distributed Parallel Modeling And Application For Industrial Processes With Large-Scale Data

Posted on: 2020-11-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Yao
Full Text: PDF
GTID: 1368330572483001
Subject: Control Science and Engineering
Abstract/Summary:
With the rise of "Industry 4.0" and the "Industrial Internet", modern industry faces new opportunities and challenges. Automation systems have matured, accelerating the informatization and intelligentization of modern industry and moving it steadily into the era of industrial big data. Large-scale data that is high-dimensional, multi-mode, and multi-unit brings more valuable information to data-driven modeling, but it also challenges traditional single-machine modeling methods. How to mine the high-value information in large-scale data fully and efficiently through an industrial Internet platform, and how to use it to solve problems in real industrial processes, is a hotspot of current process modeling research. Based on large-scale industrial process data, this thesis studies distributed parallel modeling methods for different data and process characteristics on distributed parallel computing architectures, and applies them to quality prediction and process monitoring of industrial processes. The main research contents are as follows:

(1) For large-scale data modeling in industrial processes, a distributed parallel modeling framework based on MapReduce is proposed for predicting key quality variables. A semi-supervised probabilistic principal component regression model is deployed in the framework, and local models are trained in parallel on the distributed data blocks. A MapReduce-based Bayesian fusion algorithm then integrates the quality predictions of the local models. Compared with the traditional single-machine algorithm, the MapReduce-based distributed parallel semi-supervised probabilistic principal component regression model is more computationally efficient on large-scale data. Because more data is used in training, prediction accuracy also improves significantly.

(2) For the nonlinear and semi-supervised characteristics of industrial process data, a semi-supervised deep learning model based on the hierarchical extreme learning machine is proposed. The deep auto-encoder network structure effectively extracts the nonlinear features of the data, and manifold regularization is used to construct the semi-supervised learning model. The method not only mines the information in labeled data in depth but also extracts additional useful information from large-scale unlabeled samples. Furthermore, for multi-mode process modeling with large-scale industrial data, a distributed parallel extreme learning machine and a distributed parallel hierarchical extreme learning machine based on MapReduce are proposed, following a "divide and conquer" strategy. First, a distributed parallel K-means algorithm divides the process data into its operating conditions. Then the distributed parallel hierarchical extreme learning machine trains a local model for each operating condition. Finally, a Bayesian model fusion algorithm integrates the local models to realize online prediction of quality variables.

(3) For large-scale industrial process data containing random noise and uncertainty, a distributed parallel probabilistic modeling framework based on the parameter server architecture is proposed. Under this framework, the stochastic variational inference algorithm transforms a probabilistic model based on variational inference into a scalable stochastic optimization problem, which is then deployed in distributed parallel form on the parameter server architecture. Within this framework, a distributed parallel Gaussian mixture model is proposed for multi-mode process modeling with large-scale data. In each training iteration, only one sample or a small batch of samples is randomly selected from the large data set to compute the gradient and update the parameters, which greatly improves training efficiency. Because the data set is processed in this scalable form, the distributed parallel model based on the parameter server architecture can handle large data sets easily.

(4) A semi-supervised Gaussian mixture model based on variational inference is proposed for quality prediction with semi-supervised data in multi-mode processes. To make full use of large-scale unlabeled samples, a semi-supervised Gaussian mixture model based on stochastic variational inference is proposed and deployed as a distributed parallel semi-supervised Gaussian mixture model, which significantly improves training efficiency. Including a large number of unlabeled samples in training makes the parameter estimates more accurate and improves quality prediction performance. Furthermore, for the multi-unit and multi-mode characteristics of large-scale plant-level processes, a hierarchical quality monitoring algorithm based on the distributed parallel semi-supervised Gaussian mixture model is proposed. It operates in the quality-related subspace and proceeds from the variable level to the block level and then to the plant level.

(5) For the high-dimensional variables of large-scale industrial data, a distributed parallel probabilistic latent variable model is proposed under the distributed parallel probabilistic modeling framework and applied to process monitoring and quality prediction with large-scale industrial data. A plant-wide hierarchical monitoring algorithm is also proposed for large-scale plant-level processes. First, the large-scale industrial process is divided into several local unit blocks, and a distributed parallel mixture probabilistic latent variable model is established in each local block. Then, with Bayesian inference, fault detection and diagnosis proceed from the plant-wide level to the unit block level and then to the variable level. This approach not only alleviates the heavy computational burden of plant-level process modeling but also helps improve the accuracy of plant-wide fault detection and diagnosis.
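
The pattern described in item (1), local models trained in parallel on data blocks and then fused, can be illustrated with a minimal single-process sketch. Everything in it is an assumption made for illustration: the abstract does not give the fusion rule, so the sketch uses inverse-variance (precision) weighting, a standard way to combine independent Gaussian predictions; a plain least-squares line stands in for the semi-supervised probabilistic principal component regression model; and the names `fit_local`, `fuse`, and `predict` are hypothetical.

```python
from functools import reduce
from statistics import fmean


def fit_local(block):
    """Map step: least-squares fit of y = a*x + b on one data block.

    Returns (a, b, noise_variance). A plain linear model is an
    illustrative stand-in, not the model used in the thesis.
    Assumes the block holds at least two distinct x values.
    """
    xs = [x for x, _ in block]
    ys = [y for _, y in block]
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in block) / sxx
    b = my - a * mx
    sse = sum((y - (a * x + b)) ** 2 for x, y in block)
    # Residual variance, floored so the precision 1/var stays finite.
    var = max(sse / max(len(block) - 2, 1), 1e-12)
    return a, b, var


def fuse(p, q):
    """Reduce step: combine two Gaussian predictions (mean, variance).

    Precision weighting is associative and commutative, so the
    predictions can be folded together in any order.
    """
    (m1, v1), (m2, v2) = p, q
    precision = 1.0 / v1 + 1.0 / v2
    return ((m1 / v1 + m2 / v2) / precision, 1.0 / precision)


def predict(blocks, x):
    """Train one local model per block, then fuse their predictions at x."""
    # In a real deployment each fit_local call would run on its own worker.
    models = map(fit_local, blocks)
    local_predictions = ((a * x + b, var) for a, b, var in models)
    return reduce(fuse, local_predictions)
```

Because `fuse` is associative, the reduce step could itself be split across workers, which is the property a MapReduce-style combiner relies on. Local models with a smaller residual variance receive a larger weight in the fused prediction.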
Keywords/Search Tags: data-driven modeling, large-scale industrial process, distributed parallel modeling, quality prediction, plant-wide process monitoring