Study Of Methods And Theories Based On Knowledge Extraction From Big Data Of Process Object

Posted on: 2018-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: T L Zhu
GTID: 2348330512981827
Subject: Computer Science and Technology
Abstract/Summary:
Sampled data in process-industry systems accumulate over long periods into very large historical databases, which contain a great deal of valuable information and knowledge waiting to be mined. Because production in the process industry emphasizes integrity and real-time operation, knowledge discovery should treat the complete industrial production process as the research object, from an overall and systematic point of view. In research on a knowledge discovery platform for power systems, our research group presented a computational model for knowledge discovery from big data of process objects, named the T-C-AC/T Flow algorithm. The algorithm flow comprises data preprocessing, temporal discovery, clustering, and association rule generation; association chains and state association chains between the links of a process object are then obtained through a series of calculations. At present, problems in data sampling, temporal discovery, clustering, and correlation analysis remain in the T-C-A-C/T algorithm flow and need further research. This thesis aims to solve these problems.

For data sampling, a variance-based sampling algorithm is designed. The algorithm calculates the variance of the data in each data segment and takes the segment with the largest variance as the sample segment. Experiments demonstrate the effect of the sampling.

For temporal discovery, the temporal relationships among the links of a process object are analyzed theoretically, based on the theory of computer control systems. The timing calculation algorithm based on statistical extremes is also improved, and its applicable conditions are analyzed. The algorithm calculates the delay time between two links by counting the time distance between their extremes. Taking one link as the base link, it calculates the delay time from every other link to the base link, and so obtains the timing relationships between the links of the process object. Experiments show that these timing relationships can be obtained accurately.

For clustering, a clustering algorithm based on time-series subsequence segmentation is designed to optimize the division of states. The original sampled time series is divided into subsequences with a sliding window, the subsequences are normalized, and the K-means algorithm is then used to cluster them. The best value of k is selected using an evaluation criterion based on the silhouette coefficient. Experiments demonstrate the clustering effect.

For correlation analysis, a theoretical analysis of the relationships between the links of a process object is given, which provides theoretical support for mining association relationships with timing characteristics between those links. The thesis also provides a correlation analysis between links based on the results of subsequence clustering. Two-item association rules between different clusters are extracted with the Apriori algorithm, and the degree of correlation between links is determined from the support and interest of the rules. Correlation chains between the links of the process object are then generated from the degree of correlation; each chain represents a strong correlation among the internal links of the process industry. State chains are obtained from each correlation chain and the state category of each link in the chain, so that the interaction between the different states of the links in a chain can be expressed by state association chains.

In experiments on part of the sampled data of a power system, both the correlation relationships and the state correlation relationships between the internal links of the process object are found. Finally, the state association chains can be used to guide production in the process industry, which is of great significance to its production, control, and management.
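
The variance-based sampling step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the data segments are formed, so this sketch assumes fixed-length, non-overlapping segments and ignores any trailing partial segment; the function name select_sample_segment and the example readings are hypothetical and are not taken from the thesis.

```python
from statistics import pvariance


def select_sample_segment(series, segment_length):
    """Return (start_index, segment) for the segment with the largest variance.

    Assumption: segments are fixed-length and non-overlapping, and a trailing
    partial segment is ignored. The thesis may segment the data differently.
    """
    if segment_length <= 0 or segment_length > len(series):
        raise ValueError("segment_length must be between 1 and len(series)")

    best_start, best_variance = 0, -1.0
    last_start = len(series) - segment_length
    for start in range(0, last_start + 1, segment_length):
        segment = series[start:start + segment_length]
        variance = pvariance(segment)  # population variance of this segment
        if variance > best_variance:   # ties keep the earliest segment
            best_start, best_variance = start, variance

    return best_start, series[best_start:best_start + segment_length]


# Hypothetical readings: the middle segment fluctuates the most.
readings = [5.0, 5.1, 5.0, 5.1, 2.0, 9.0, 4.0, 8.0, 5.0, 5.0, 5.2, 5.0]
start, segment = select_sample_segment(readings, 4)
print(start, segment)
```

Choosing the highest-variance segment favors the stretch of history in which the process changes most, which is presumably where the later temporal discovery and clustering steps have the most structure to work with.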
Keywords/Search Tags: process object, temporal discovery, subsequence clustering, association rules, association chains, state association chains