| With the rapid development of the Internet in recent years, computer software of all kinds has become part of everyday life. It brings convenience, but it also brings complex and diverse network attacks that threaten the information security of users and even countries. Discovering and patching vulnerabilities in a timely manner has therefore become a hot research topic. Because most software is closed source and in most cases only its binary code can be obtained, research on binary code vulnerability detection has important practical significance. Faced with complex and diverse software vulnerabilities, traditional detection methods lack efficiency and scalability, and machine learning is increasingly used to improve vulnerability detection. However, existing machine learning-based binary code vulnerability detection methods extract only limited code features, generalize poorly, and leave room for improvement in the accuracy of vulnerability location. This paper therefore proposes a machine learning-based binary code vulnerability detection method that combines dynamic and static analysis, and designs and implements a prototype binary code vulnerability detection system. The main work of this paper is as follows: 1. A Pin-based feature analysis method that combines static and dynamic program features is proposed. Static features reflect the syntactic and semantic information of a program, while dynamic features reflect its runtime state. Custom tools built on top of Pin, together with the efficiency of its instrumentation, are used to analyze both kinds of features and to extract features from the binary program. 2. A program slicing method based on instruction length is proposed. Program slicing makes it possible to learn fine-grained information at the instruction level, which helps locate vulnerabilities, and also to learn richer structural information and the overall characteristics of a vulnerability at the coarser granularity of multiple instructions. The method is tailored to the binary code vulnerability detection scenario and to the feature extraction method studied in this paper: after an instrumented RTN is parsed, it is divided into code segments according to instruction length, which avoids the inaccurate detection caused by heavy zero-padding or truncation. 3. The Addr2line tool is used to locate vulnerabilities, reducing the time and labor cost of vulnerability troubleshooting. While preserving the accuracy of vulnerability location, the range reported for each vulnerability is kept as narrow as possible. For a program that contains a vulnerability, the binary addresses of the vulnerable code segment are mapped to the corresponding location in the source program with Addr2line, and the address range of the instructions is traced to identify where the vulnerability lies. 4. A machine learning-based binary code vulnerability detection process is proposed, and a binary code vulnerability detection system is designed and implemented. A series of comparative experiments and an analysis of their results show that the proposed method performs well on metrics such as accuracy and precision, and that it can effectively detect and locate binary code vulnerabilities. |
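
The slicing and location steps in items 2 and 3 can be illustrated with a minimal Python sketch. It is not the implementation described in this paper. It assumes that "instruction length" means a fixed number of instructions per segment, that a parsed RTN is available as a list of `(address, instruction text)` pairs, and that a final short segment is kept as-is rather than padded. The names `CodeSlice`, `slice_routine`, and `addr2line_command` are hypothetical. The only real tool referenced is GNU `addr2line`, with its `-e` (executable), `-f` (function names), and `-C` (demangle) options.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CodeSlice:
    """One code segment of a routine (hypothetical structure)."""
    start_addr: int          # address of the first instruction in the slice
    end_addr: int            # address of the last instruction in the slice
    instructions: List[str]  # instruction text, in program order


def slice_routine(routine: List[Tuple[int, str]], slice_len: int) -> List[CodeSlice]:
    """Split a parsed routine into consecutive slices of `slice_len` instructions.

    Assumption: the last slice may be shorter than `slice_len`; it is kept
    as-is, so no instruction is dropped and no artificial padding is added.
    """
    if slice_len <= 0:
        raise ValueError("slice_len must be positive")
    slices = []
    for i in range(0, len(routine), slice_len):
        chunk = routine[i:i + slice_len]
        slices.append(CodeSlice(
            start_addr=chunk[0][0],
            end_addr=chunk[-1][0],
            instructions=[text for _, text in chunk],
        ))
    return slices


def addr2line_command(binary_path: str, code_slice: CodeSlice) -> List[str]:
    """Build an addr2line invocation for the two ends of a slice.

    The binary must carry debug information for addr2line to resolve
    addresses to source file names and line numbers.
    """
    return [
        "addr2line", "-e", binary_path, "-f", "-C",
        hex(code_slice.start_addr), hex(code_slice.end_addr),
    ]


# Illustrative input: made-up addresses and instructions, not real trace data.
routine = [
    (0x1000, "push rbp"),
    (0x1001, "mov rbp, rsp"),
    (0x1004, "sub rsp, 0x10"),
    (0x1008, "call 0x2000"),
    (0x100d, "leave"),
]
slices = slice_routine(routine, slice_len=2)
command = addr2line_command("./target_binary", slices[1])
```

If a classifier flags the second slice, the resulting command list can be passed to `subprocess.run` to resolve both ends of the slice to source lines. Keeping the start and end address on every slice is what lets a detection result be traced back to a narrow address range.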