Inferring gene regulatory networks is a key step in understanding biological processes, and downstream research building on these networks can suggest strategies for drug repositioning and precision medicine. Microarray technology generates large amounts of gene expression data, providing a workable data foundation. Many computational methods have been developed to infer gene regulatory networks, including clustering and linear programming. One of the most effective approaches combines the path consistency (PC) algorithm with conditional mutual information (CMI), because CMI can measure the non-linear dependence that is widespread in biology and can discriminate direct regulations from indirect ones. PCA-CMI and CMI2NI, both based on this idea, are widely used to reconstruct gene regulatory networks. However, selecting the conditional genes in an optimal way remains a challenge, and this choice affects both the performance and the computational complexity of the PC algorithm. In this study, we first analyze why the PC algorithm combined with conditional mutual information can reconstruct gene regulatory networks effectively, and then analyze the disadvantages of current algorithms from both theory and case studies. Drawing on biological facts, we define the co-regulation pattern, the indirect-regulation pattern and the mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. Detecting a candidate pattern only requires knowing upstream and downstream relationships, rather than direct upstream and downstream interactions. This explains why upstream and downstream interaction results cannot be used directly to infer a gene regulatory network, yet are sufficient for detecting candidate patterns.
Finally, we develop a novel algorithm based on candidate gene selection, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noise interference analysis using different types of noise. We also demonstrate its effectiveness on acute myeloid leukemia (AML) RNA sequencing data from TCGA. By analyzing the two differential target gene sets using the cancer gene annotation system CaGe, we noticed that the target gene set inferred by RPNI is more significantly enriched for AML cancer pathways than that inferred by CMI2NI. Lastly, we discuss some common algorithms for determining edge direction. Overall, the candidate patterns proposed in this paper address the challenge of how to determine the conditional genes. Moreover, candidate patterns apply not only to conditional mutual information but also generalize to any measure of conditional dependency.
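As an illustration of how conditional mutual information can separate direct from indirect regulation, the sketch below estimates CMI with the closed-form expression that holds when the variables are jointly Gaussian, I(X;Y|Z) = 1/2 log(|C(X,Z)| |C(Y,Z)| / (|C(Z)| |C(X,Y,Z)|)), where C denotes a covariance matrix. This is only a minimal sketch under that Gaussian assumption and is not the RPNI procedure itself; the function name `gaussian_cmi`, the simulated chain X -> Z -> Y and all numeric settings are hypothetical choices made for the example.

```python
import numpy as np

def gaussian_cmi(x, y, z=None):
    """Estimate I(X;Y|Z) in nats, assuming the variables are jointly Gaussian.

    x, y: 1-D arrays of samples; z: optional 1-D or 2-D array of
    conditional variables (samples in rows). With z=None this reduces
    to plain mutual information I(X;Y).
    """
    def logdet(*cols):
        # Log-determinant of the sample covariance of the stacked columns.
        data = np.column_stack(cols)
        cov = np.atleast_2d(np.cov(data, rowvar=False))
        _, value = np.linalg.slogdet(cov)
        return value

    if z is None:
        return 0.5 * (logdet(x) + logdet(y) - logdet(x, y))
    return 0.5 * (logdet(x, z) + logdet(y, z) - logdet(z) - logdet(x, y, z))

# Hypothetical indirect-regulation chain: X regulates Z, Z regulates Y.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

mi = gaussian_cmi(x, y)        # X and Y look dependent when Z is ignored
cmi = gaussian_cmi(x, y, z)    # conditioning on Z removes the indirect link
print(f"MI(X;Y) = {mi:.3f}, CMI(X;Y|Z) = {cmi:.3f}")
```

In a PC-style procedure, an edge between X and Y would be removed when the CMI falls below a chosen threshold. The result depends on which gene is used as the conditional variable Z, which is the selection problem that the candidate patterns are meant to guide.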