Research On Homology Analysis For Malware Based On Behavior Tree

Posted on: 2021-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: S L Yu
GTID: 2428330611965691
Subject: Software engineering
Abstract/Summary:
The explosive growth of malware poses a serious threat to computer security. Most malware samples, however, are variants of known families and behave similarly. Because malware relies on system-provided APIs to implement its behavior, existing methods for malicious behavior detection and homology analysis are mainly based on API call sequences. These methods define behavior without information about the control structure of the API calls. Yet APIs invoked under different control structures may carry different behavioral semantics; if behavior is not defined accurately, malware is assigned to the wrong family and homology analysis becomes less effective. In addition, malware authors can evade traditional sequence-based detection by inserting noise APIs into the API call sequence. To address these problems, we propose a method of homology analysis for malware based on behavior tree (HAMBT for short). The main contributions of this paper are as follows:

(1) We propose a behavior tree to represent the behavior model of malware, which reflects the control structure of system API calls. When malware calls system APIs, the calls in the sequence are related by sequence, parallel, loop, and exclusive-choice relationships; these are the control structures of the system API calls. The behavior tree model captures all four relationships, which overcomes a shortcoming of existing homology analysis methods based on API call sequences: they consider only the sequential relationship between API calls.

(2) We propose a malware family classification method based on the behavior tree. First, a behavior tree mining algorithm builds the behavior model of the malware. To discover the relations between behaviors, we extract behavior patterns from the behavior tree to form the behavior features of the malware. This method is robust to noise: under certain noise conditions, it effectively keeps noise from interfering with the behavior features. Second, to obtain statistically significant behavior patterns shared by malware in the same family, which constitute the family's weighted behavior features, we weight each behavior pattern using its occurrence frequency within the family and its average occurrence frequency in the other families. When calculating the behavioral similarity between a malware sample and a family, we use the weights of the behavior patterns and the linear sequence expression of the behavior tree, which avoids the graph matching problem. From the similarity between the sample and each family, we construct a similarity vector for the sample. Because the dimensions of the similarity vector are uncorrelated, which is consistent with the attribute-independence assumption of the Naive Bayes classification algorithm, we use Naive Bayes to train the malware family classification model.

Our experiments on public data show that HAMBT achieves a family classification accuracy of 81.97%, which is 10% higher than traditional methods based on API call sequences, and that HAMBT has a high TPR and a low FPR for families with many samples. The average TPR, Precision, F1-score, and FPR across families are 69.1%, 69.4%, 70.1%, and 0.45%, respectively. When the API call sequence contains noise, HAMBT still maintains a classification accuracy of 81.11%, indicating that HAMBT is effective for malware homology analysis.
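
The weighting and similarity steps described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the operator names (seq, par, loop, xor), the nested-tuple tree encoding, the weight defined as in-family frequency minus mean frequency in the other families, and the similarity defined as the matched share of a family's total weight. The resulting similarity vector is what a Naive Bayes classifier would be trained on; that training step is not shown here.

```python
from collections import Counter


def linearize(node):
    """Flatten a behavior tree into a linear string expression.

    Hypothetical encoding: a leaf is an API name (str); an inner node is
    (operator, [children]) with operator in {"seq", "par", "loop", "xor"}.
    """
    if isinstance(node, str):
        return node
    operator, children = node
    return operator + "(" + ",".join(linearize(c) for c in children) + ")"


def pattern_frequencies(samples):
    """Fraction of a family's samples in which each pattern occurs."""
    counts = Counter()
    for patterns in samples:
        counts.update(set(patterns))
    return {p: c / len(samples) for p, c in counts.items()}


def family_weights(families):
    """Weight each pattern per family (assumed formula, not the thesis's).

    weight = frequency in the family - mean frequency in the other families;
    patterns with non-positive weight are dropped as non-distinctive.
    """
    freqs = {name: pattern_frequencies(s) for name, s in families.items()}
    weights = {}
    for name, own in freqs.items():
        others = [f for other, f in freqs.items() if other != name]
        family_w = {}
        for pattern, f_own in own.items():
            f_other = 0.0
            if others:
                f_other = sum(f.get(pattern, 0.0) for f in others) / len(others)
            if f_own - f_other > 0:
                family_w[pattern] = f_own - f_other
        weights[name] = family_w
    return weights


def similarity_vector(sample_patterns, weights):
    """One similarity score per family, in sorted family-name order."""
    vector = []
    for name in sorted(weights):
        family_w = weights[name]
        total = sum(family_w.values())
        matched = sum(w for p, w in family_w.items() if p in sample_patterns)
        vector.append(matched / total if total else 0.0)
    return vector


# Toy data: each sample is a set of linearized behavior patterns.
write_file = linearize(("seq", ["CreateFileW", "WriteFile"]))
set_registry = linearize(("loop", ["RegSetValueExW"]))
send_data = linearize(("seq", ["connect", "send"]))

families = {
    "famA": [{write_file, set_registry}, {write_file}],
    "famB": [{send_data, set_registry}, {send_data}],
}
weights = family_weights(families)
print(similarity_vector({write_file, set_registry}, weights))
```

Comparing linearized patterns as plain strings is what lets this sketch avoid graph matching: set membership replaces subtree isomorphism. A pattern shared equally by both toy families (the registry loop) receives no weight, so it does not influence either similarity score.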
Keywords: malware, homology analysis, behavior tree