| In recent years, the discussion of cyberspace security has grown increasingly intense, and the network security situation has become complex and severe. Web applications play an extremely important role in daily life, for example in social networking, online banking, and e-commerce. However, while they bring convenience, the number of vulnerabilities exposed in web applications shows a linear growth trend, and web applications have become the main target of network attacks. Timely detection of malicious attacks through technical means is therefore very important. Taking the behavior dependencies of malware as the starting point, feature-based filtering of software semantic behavior, closely combined with dynamic and static taint analysis of program code, can effectively improve the vulnerability detection rate and the identification of malicious code families. This paper carries out its research on the following points: (1) To address the path explosion and false alarms that arise when a behavior dependency graph is generated from the function calls of a web application, a Precise Behavior Dependency Graph (PBDG) method based on extracting and verifying the dependencies between malicious codes is proposed. First, the behavior relationships of sensitive data are obtained by taint tracking with custom taint propagation rules, and a blacklist of taint sources is then used to filter the results and build index files, which improves storage space usage and the ability to locate instructions. Second, a live-variable path verification algorithm traverses backward along the taint source → taint sink paths generated from the index and removes false taints, which further mitigates the path space problem. Finally, combined with path-sensitive taint analysis and with particular attention to the function call process, a precise behavior dependency graph of the malicious code is generated from the taint file and applied to malicious code identification and vulnerability analysis. It can effectively detect web application vulnerabilities and malware. (2) To address malicious programs that evade detection through code obfuscation, a precise behavior dependency graph matching algorithm is proposed. The precise behavior dependency graph of the source program under test is matched against the precise behavior dependency graphs of malware already in the malicious code base using graph edit distance similarity. First, the precise behavior dependency graph to be matched is divided into subgraphs (which are further divided into paths). The computed graph edit distance is the sum of the matching costs of nodes and edges between the two graphs. The smaller the matching cost of nodes and edges, the smaller the edit distance between the two graphs and the more similar their structural features. Finally, if the matching result of the two graphs is a graph isomorphism, the match succeeds. With the precise behavior dependency graph matching algorithm, malicious code families can be identified effectively. The experimental results show that constructing the precise behavior dependency graph effectively improves the identification rate of malicious code, reduces the false negative rate of vulnerability reporting, and improves the vulnerability detection rate. It provides a feasible way to address the false positive rate and effectiveness problems of malware detection, especially web vulnerability detection. Graph matching is performed by computing the graph edit distance between precise behavior dependency graphs with the matching algorithm, and this method effectively identifies malicious code families. |
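
The matching step in point (2) can be illustrated with a minimal sketch. The abstract does not state the cost function, so this example assumes that every node carries a unique label (for instance a function name), that nodes are matched by label, and that each node or edge insertion or deletion costs 1. Under those assumptions the edit distance reduces to the size of the symmetric difference of the node sets plus that of the edge sets. The names `graph_from_paths`, `edit_distance`, `similarity`, and `best_family`, as well as the sample labels, are hypothetical and are not taken from the paper.

```python
def graph_from_paths(paths):
    """Build a (nodes, edges) pair from source-to-sink paths of node labels."""
    nodes, edges = set(), set()
    for path in paths:
        nodes.update(path)
        # Consecutive labels on a path form a directed edge.
        edges.update(zip(path, path[1:]))
    return nodes, edges


def edit_distance(g1, g2):
    """Sum of node and edge costs, assuming unit-cost insert/delete
    and nodes matched by their (unique) labels."""
    (nodes1, edges1), (nodes2, edges2) = g1, g2
    node_cost = len(nodes1 ^ nodes2)  # nodes present in only one graph
    edge_cost = len(edges1 ^ edges2)  # edges present in only one graph
    return node_cost + edge_cost


def similarity(g1, g2):
    """Map the distance to [0, 1]; 1.0 means the labeled graphs are identical."""
    size = sum(len(part) for g in (g1, g2) for part in g)
    return 1.0 if size == 0 else 1.0 - edit_distance(g1, g2) / size


def best_family(sample, families):
    """Return the name of the known family whose graph is most similar."""
    return max(families, key=lambda name: similarity(sample, families[name]))


if __name__ == "__main__":
    # Hypothetical graphs: labels stand for calls on taint source -> sink paths.
    sample = graph_from_paths([["input", "decode", "eval"]])
    families = {
        "webshell": graph_from_paths([["input", "decode", "eval"],
                                      ["input", "eval"]]),
        "sqli": graph_from_paths([["input", "concat", "query"]]),
    }
    for name, graph in families.items():
        print(name, edit_distance(sample, graph), similarity(sample, graph))
    print("closest family:", best_family(sample, families))
```

A distance of 0 in this sketch means the two labeled graphs are identical, which corresponds to the isomorphism case described above. A cost model that also allows relabeling nodes, as would be needed against obfuscated identifiers, requires a general graph edit distance solver instead of set differences.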