Research And Implementation Of Industrial Equipment Status Judgment Method Based On Data Stream Clustering

Posted on: 2024-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: D Chen
GTID: 2542306938951649
Subject: Computer technology
Abstract/Summary:
With the increasing digitization of industry, high-speed data streams now arise at every stage of industrial production. These streams are unlabeled, unevenly distributed, and arrive incrementally, so mining valuable information from them has become a research hotspot. Clustering is an unsupervised data mining method that partitions data sets effectively without pre-labeling or pre-training a model on massive data. As a branch of clustering, data stream clustering builds on traditional clustering to process data incrementally, which makes it better suited to analyzing industrial data streams. This thesis takes the power parameter data of industrial equipment as its research object. By analyzing the characteristics of industrial data streams, it proposes an improved data stream clustering method to cluster and analyze industrial equipment power data, and so to judge the operating status of industrial equipment intelligently. The main contributions fall into three areas.

(1) Industrial electricity data is distributed in arbitrary shapes. To handle this, and building on traditional clustering, this thesis proposes an adaptive domain density peak clustering algorithm based on natural neighbors. The algorithm improves clustering performance and lays the theoretical foundation for the subsequent work on data stream clustering. First, the algorithm introduces the concept of natural neighbors to obtain the neighborhood information of each data sample adaptively, and computes the local density and other metrics from it. This prevents parameters such as the cutoff (truncation) distance from affecting the clustering results, and reflects the real distribution of the data more accurately and naturally. Second, manually selected cluster centers tend to miss the centers of sparse clusters on data sets with uneven density. To address this, an automatic cluster center selection strategy is proposed that widens the range of candidate centers and makes center selection more reliable. Third, for the label assignment stage, a two-stage assignment strategy based on the natural eigenvalue is proposed. It defines the concept of dense points and assigns them first to form the cluster prototypes, then assigns each remaining point to the cluster of its superior point (in density peak clustering, its nearest point of higher density). This reduces the probability of chained assignment errors caused by a wrongly assigned superior point. Finally, because the automatic center selection strategy may produce more initial clusters than actually exist, a sub-cluster fusion mechanism based on nearest-neighbor similarity is proposed. It repeatedly merges the two most similar sub-clusters to obtain the final clustering result. Experiments show that the algorithm has excellent clustering performance and parameter robustness.

(2) Industrial power data arrives incrementally. To process it incrementally, an improved data stream clustering algorithm based on dynamic weights and density peaks is proposed within the classic online/offline framework of data stream clustering. In the online stage, the algorithm takes full account of the position of each newly arrived data point and assigns it an initial weight according to its distance from the center of the corresponding micro-cluster. This describes the micro-cluster information accurately and is consistent with the distribution characteristics of industrial power data. The algorithm also updates the micro-cluster pruning mechanism so that changes in a micro-cluster's role caused by weight changes are captured in time. In the offline stage, the adaptive domain density peak clustering algorithm based on natural neighbors clusters the data maintained in the online stage, which improves the ability of the data stream clustering algorithm to handle data sets of arbitrary shape and density. Comparative experiments in static and data stream environments verify the feasibility and effectiveness of the algorithm.

(3) In an industrial environment, the improved method proposed in this thesis is used to cluster and analyze the electrical parameter data of industrial equipment and thereby determine the equipment's operating status. After the industrial power consumption data is preprocessed, the algorithm parameters are tuned on the power consumption parameter data collected at different industrial equipment points, and a targeted and efficient model is constructed. On this basis, incremental clustering analysis is performed on the industrial power consumption data to judge the operating status of industrial equipment intelligently. Finally, a visualization platform for the operating status of industrial equipment is built with the Django framework to support equipment status monitoring.

In summary, the work in this thesis forms a complete research process: from preprocessing industrial power data, to designing a traditional clustering algorithm for static data, to designing a data stream clustering algorithm that builds on it, and finally to applying the improved algorithm in an industrial environment to judge equipment status. Solutions are proposed for the problems found at each stage, improving the clustering performance of the algorithm and the adaptivity of its parameters. Comparative experiments show that the proposed algorithm scores better on clustering metrics, partitions data sets accurately, and can be applied in practical engineering environments.
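The natural neighbor idea behind contribution (1) can be illustrated with a small sketch. It follows the commonly described parameter-free search: the neighborhood size r grows until every point has been chosen as a neighbor by some other point, or until the number of unchosen points stops changing, and the final r is taken as the natural eigenvalue. The function name `natural_neighbors`, the stopping rule, and the mutual-neighbor definition used here are assumptions for illustration, not the thesis's implementation.

```python
from math import dist


def natural_neighbors(points):
    """Return (natural_eigenvalue, neighbors) for a list of coordinate tuples.

    neighbors[i] is the set of indices j such that i and j are in each
    other's k-nearest-neighbor lists, with k equal to the natural eigenvalue.
    Illustrative sketch only: brute-force O(n^2 log n) distance sorting.
    """
    n = len(points)
    # For every point, all other indices sorted by increasing distance.
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: dist(points[i], points[j]))
        for i in range(n)
    ]
    reverse_count = [0] * n   # how often each point was chosen as a neighbor
    prev_orphans = None       # points never chosen, in the previous round
    r = 0
    while r < n - 1:
        # Round r: every point votes for its (r+1)-th nearest neighbor.
        for i in range(n):
            reverse_count[order[i][r]] += 1
        r += 1
        orphans = sum(1 for c in reverse_count if c == 0)
        # Assumed stopping rule: no orphans left, or the count is stable.
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    natural_eigenvalue = r
    knn = [set(order[i][:natural_eigenvalue]) for i in range(n)]
    neighbors = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return natural_eigenvalue, neighbors


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
    lam, nn = natural_neighbors(pts)
    print(lam, nn)
```

No neighborhood size or cutoff distance is supplied by the caller; the search derives it from the data, which is the property the abstract relies on when computing local density.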
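The online stage in contribution (2) can be sketched in the same hedged spirit. The abstract states only that a new point's initial weight depends on its distance to the micro-cluster center; the exponential weight function, the `radius` and `decay_rate` parameters, and the power-of-two time decay (a fading function familiar from density-based stream clustering) are all assumptions chosen for illustration. Pruning and micro-cluster roles are omitted.

```python
from dataclasses import dataclass
from math import dist, exp


@dataclass
class MicroCluster:
    center: tuple
    weight: float
    last_update: float


def initial_weight(point, center, radius):
    # Assumed form: a point at the center gets weight 1, farther points less.
    return exp(-dist(point, center) / radius)


def decay(weight, elapsed, decay_rate):
    # Assumed fading function: the weight halves every 1/decay_rate time units.
    return weight * 2 ** (-decay_rate * elapsed)


def insert_point(clusters, point, now, radius=1.0, decay_rate=0.01):
    """Absorb `point` into the nearest micro-cluster or open a new one."""
    nearest = min(clusters, key=lambda mc: dist(point, mc.center), default=None)
    if nearest is None or dist(point, nearest.center) > radius:
        # Too far from every existing micro-cluster: start a new one.
        mc = MicroCluster(center=tuple(point), weight=1.0, last_update=now)
        clusters.append(mc)
        return mc
    w_new = initial_weight(point, nearest.center, radius)
    w_old = decay(nearest.weight, now - nearest.last_update, decay_rate)
    total = w_old + w_new
    # Weighted mean keeps the center close to heavily weighted points.
    nearest.center = tuple(
        (w_old * c + w_new * p) / total
        for c, p in zip(nearest.center, point)
    )
    nearest.weight = total
    nearest.last_update = now
    return nearest


if __name__ == "__main__":
    clusters = []
    for t, p in enumerate([(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)]):
        insert_point(clusters, p, now=float(t))
    print(clusters)
```

Because each point contributes a weight below 1 unless it falls exactly on the center, points on the fringe of a micro-cluster shift its center and mass less than points near its core.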
Keywords/Search Tags: Industrial power data, Data stream clustering, Density peak, Natural neighbors, Dynamic weights