With the rapid development of chip manufacturing technology, the feature size and operating voltage of modern components keep shrinking, which makes chips increasingly susceptible to soft errors. Soft errors can lead to silent data corruption (SDC). Because no warning is reported while an SDC propagates, SDC is the most difficult kind of failure to detect. The key challenge in error propagation analysis is control flow propagation, because it is difficult to describe explicitly how program execution changes when an error alters the control flow. Researchers have therefore recast control flow propagation analysis as the prediction of fault-tolerant branches. A fault-tolerant branch is a branch that, even when taken incorrectly, does not lead to SDC. Existing methods identify fault-tolerant branches by defining their structural features. However, because fault-tolerant branches depend on the complex semantics of the program, the defined features cannot cover all possible fault-tolerant branch structures, resulting in low detection accuracy. This paper uses graph representation learning to model the complex semantics of the program context and to learn the latent features of branch structures automatically, so that fault-tolerant branches can be predicted quickly and accurately. Finally, this paper combines the predicted fault-tolerant branches with an analysis model to detect the SDC-causing instructions in the program, and protects these vulnerable instructions by instruction duplication, thus improving the overall reliability of the program.

The main research work of this paper is as follows:

(1) A program semantic representation method based on the LLVM intermediate representation is proposed. To fully represent the structural information of the program, this paper extracts the control flow graph, data flow graph and call flow graph from the LLVM intermediate representation and transforms them into information flows between basic blocks. In addition, the instruction types and operands within each basic block are extracted, which provides the basis for describing various types of error propagation.

(2) A fault-tolerant branch prediction method based on a graph attention network is proposed, which transforms the task of predicting fault-tolerant branches into the task of classifying the basic blocks of a program. A graph attention network allows nodes in the same neighborhood to be assigned different importance, so that the different error propagation effects between nodes can be learned. The network quantifies the importance of each node and automatically captures the latent features related to error propagation in order to predict the category of each basic block. The experimental results show that the average F1 score on the test programs is 0.85, and the time cost is 0.57 times less than that of the fault injection method.

(3) A software protection method based on fault-tolerant branches is proposed. A propagation model is constructed from the fault-tolerant branches, the probability that each instruction causes SDC is calculated, and the SDC-causing instructions are selected and protected by instruction duplication. The results show that the SDC detection rate reaches 80.6% at an overhead of 51.86% introduced by the duplicated instructions, which improves SDC protection at a low cost.
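
The selection step in contribution (3) can be illustrated with a minimal sketch. It assumes that each instruction already has an estimated SDC probability and a dynamic execution count, and that duplication cost is proportional to that count. The `Instruction` fields, the greedy ranking and the `overhead_budget` parameter are illustrative assumptions, not the selection procedure used in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str        # hypothetical label, e.g. an LLVM IR value name
    sdc_prob: float  # estimated probability that a fault here ends in SDC
    dyn_count: int   # dynamic execution count, used as a proxy for duplication cost


def select_for_duplication(instructions, overhead_budget):
    """Greedily choose instructions to duplicate under an overhead budget.

    overhead_budget is a fraction of the total dynamic instruction count:
    0.5 means duplicated instructions may add at most 50% extra executions.
    Returns the chosen instructions and the overhead fraction actually used.
    """
    total = sum(i.dyn_count for i in instructions)
    if total == 0:
        return [], 0.0
    allowed = overhead_budget * total
    # Benefit is sdc_prob * dyn_count and cost is dyn_count, so the
    # benefit-to-cost ratio is simply sdc_prob: rank by it, highest first.
    ranked = sorted(instructions, key=lambda i: i.sdc_prob, reverse=True)
    chosen, spent = [], 0
    for inst in ranked:
        if inst.sdc_prob <= 0.0:
            continue  # never causes SDC in this model, nothing to protect
        if spent + inst.dyn_count <= allowed:
            chosen.append(inst)
            spent += inst.dyn_count
    return chosen, spent / total


def expected_coverage(instructions, chosen):
    """Fraction of the estimated SDC risk covered by the chosen instructions."""
    total_risk = sum(i.sdc_prob * i.dyn_count for i in instructions)
    covered = sum(i.sdc_prob * i.dyn_count for i in chosen)
    return covered / total_risk if total_risk else 0.0
```

Ranking by SDC probability alone is a deliberate simplification. A fuller treatment would treat selection as a knapsack problem over measured per-instruction costs, but the sketch is enough to show the trade-off between detection rate and duplication overhead that the reported figures describe.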