
Research On Software Supply Chain Pollution Detection

Posted on: 2020-02-16  Degree: Master  Type: Thesis
Country: China  Candidate: Z H Wu
GTID: 2428330620953250  Subject: Computer technology
Abstract/Summary:
In recent years, software supply chain pollution has become increasingly serious, posing a great threat to users' privacy and property. Attackers exploit security weaknesses in the production, delivery, or use of software to attack users covertly by hijacking or tampering with legitimate software in the supply chain. However, few studies focus on software supply chain security, and most existing work discusses supply chain pollution, pollution detection, and pollution prevention only at a macro level. Worse, few studies propose a systematic detection solution. In response, this thesis systematically studies pollution detection methods at the downstream end of the software supply chain. The main work and contributions are as follows:

1. Software supply chain pollution detection currently relies on program reverse analysis, but no work has summarized the research progress in this area. To address this, the thesis analyzes more than 100 domestic and international software supply chain security incidents, summarizes the state of research on program reverse analysis for software supply chain pollution detection, and identifies the problems that remain in applying automated program analysis to this task.

2. Because little work exists on pollution detection in the software supply chain, and the related research is neither systematic nor in-depth, the thesis carries out pollution detection research at the downstream end of the supply chain. Starting from both the pollution target and the pollution approach, it proposes a software supply chain pollution classification model and analyzes in detail the special problems of supply chain pollution detection. It also proposes a software supply chain pollution detection framework that divides downstream pollution detection into three categories: detecting bundled installations in installation packages, detecting reuse of vulnerable or malicious third-party modules, and detecting malicious code embedded in target programs.

3. Downstream pollution detection requires that the executable files first be released (extracted) from the installation package. However, existing automatic installation schemes for software installation packages generalize poorly and have low success rates. To address this, the thesis proposes a reliable, incremental automatic software installation method that combines silent installation parameters, component identification, and OCR recognition. The method achieves highly automated software installation and extracts the final executable programs. Experimental results show that, on a collected set of over 30,000 software installation packages in more than 10,000 categories, the scheme successfully installs more than 95% of the samples, laying the foundation for the specific detection methods in the pollution detection framework.

4. For the three categories of pollution detection problems defined in the framework, the thesis analyzes the shortcomings of existing techniques in detail and proposes systematic solutions:

a) Existing schemes for detecting bundling in installation packages generalize poorly, so the thesis proposes an approach based on installation information and machine learning to detect bundling behavior in installers. It first collects behavior information and released-file information during the software installation process. It then detects malicious code with an attention-based LSTM (long short-term memory) neural network model, and finally detects bundling of benign software using the software installation directory and software family classification results. Experimental results show that the model effectively detects software installation packages with bundling behavior.

b) Effective and precise methods for detecting third-party module reuse are lacking, so the thesis proposes an approach based on hash comparison and function similarity analysis to detect reused modules in software. Binary file reuse and source code reuse are detected separately: binary file reuse is detected by hash comparison, and source code reuse is detected by cross-platform, cross-compiler binary function similarity analysis. Experiments show that this scheme effectively finds module reuse in software binaries.

c) Embedded malicious code is well concealed, dynamic analysis has limited coverage, and in-depth detection is time-consuming. To address these problems, the thesis proposes an embedded malicious code detection scheme based on differential analysis. First, software is classified into families, and software lineage analysis is performed on the software within each family. A deep neural network (DNN) is then used to assess the differences between adjacent versions of the software. Finally, the differing code in the target program is analyzed in depth. Experimental results show that the software family classification and software lineage analysis schemes are fast and effective, which improves the overall efficiency of malicious code detection.
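
The hash-comparison step for detecting binary file reuse, mentioned in item b), can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the hash function or the database format, so SHA-256, the `known_modules` dictionary, and the function names `sha256_of_file` and `find_reused_modules` are all assumptions made for illustration. The sketch only covers exact byte-for-byte reuse; the function similarity analysis used for source code reuse is not shown.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_reused_modules(install_dir, known_modules):
    """Map each file under install_dir whose hash is in known_modules
    to the label stored for that hash.

    known_modules is a hypothetical {sha256_hex: label} dictionary of
    third-party modules already flagged as vulnerable or malicious.
    """
    root = Path(install_dir)
    matches = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        label = known_modules.get(sha256_of_file(path))
        if label is not None:
            # Use a POSIX-style relative path so results are portable.
            matches[path.relative_to(root).as_posix()] = label
    return matches
```

In this sketch, `install_dir` would be the directory of files released by the automated installation step, and any non-empty result indicates that a known module has been reused unchanged. A module that was recompiled or modified would not match, which is why the abstract pairs hash comparison with function similarity analysis.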
Keywords/Search Tags:software supply chain, pollution detection, program analysis, automated installation, malicious code detection