Font Size: a A A

Research On Causal Structure Learning Algorithm Based On Streaming Features

Posted on:2019-07-23Degree:MasterType:Thesis
Country:ChinaCandidate:X X GuoFull Text:PDF
GTID:2428330548985935Subject:Software engineering
Abstract/Summary:PDF Full Text Request
The goal of causal structure learning is to obtain a causal model that reflects the objective relationship between things by training a large amount of data.Mining the causal structural relationships contained in the data,which provides ideas and evidence for cognizing the laws of development of things and exploring the essential factors that cause changes,and make more informed decisions.That is an important research direction in the field of data mining.However,research on this issue of causal structure learning is still confined to the static feature space so far.With the dynamic change and the rapid growth of data,the method based on static feature space obviously is becoming unadaptable and infeasible in space and time.In order to apply the causal structure learning algorithm to the dynamic feature space as effectively and efficiently,this paper mainly studies the causal structure learning method in the dynamic feature space.At the same time,we ensure that this method is also suitable for the causal structure learning problem in the static feature space,which is more efficient than the exsiting method.Specifically,the research work of this paper is as follows:Firstly,aiming at the current situation that the existing causal structure learning algorithms can not adapt to the dynamically changing of feature space,we study the key theories and techniques in causal structure learning and dynamic feature space,model the dynamics of feature space with the concept of streaming features,and then proposed a kind of framework about causal structure learning based on streaming features.In this framework,for each new feature,we first select its possible candidate neighbors to reduce the space of candidate neighbors among the arrived features.then further filtration is performed based on these possible candidate neighbor nodes to obtain its candidate neighbor node.For each new feature,the above process is performed.When no new feature arrives,we begin to conduct limited greedy search to obtain the final network structure.Secondly,according to the above framework,we first propose a algorithm called CSBS algorithm that is a causal structure learning algorithm using the chi-square test based on streaming features.For a new feature,CSBS first performs unconditional independence test with each feature that has arrived based on G2 test to exclude independent relationship features.For these features with dependent relationship,we perform further conditional independence tests.After the above conditional independence test,we obtain its candidate neighbor features for each feature that has arrived.Finally,when no feature arrives,a greedy search is performed based on the candidate neighbor matrix to determine the direction of edges and obtain the final network structure.The experimental results prove effectiveness of the algorithm for the problem of causal structure learning in streaming features context and also testify rationality and feasibility of the framework proposed.Finally,we propose another causal structural learning algorithm with streaming features named CSSU algorithm that is based on the mutual information.CSSU calculates the mutual information value with each feature that has arrived and by the predefined mutual information threshold to filter features as its possible candidate neighbor features.For these features satisfying the threshold,we further calculate and compare the mutual information to obtain the candidate neighbor features of the feature.Compared to conditional independence test based on subsets,mutual information comparation based on individual features actually optimize the exponential level computational complexity to the linear level.The experimental results show that the algorithm outperforms other algorithms used in our experiments in terms of accuracy and time performance.
Keywords/Search Tags:Streaming features, Causal structure learning, Bayesian network, Candidate neighbors
PDF Full Text Request
Related items