
Research on PDF Document Security Detection Methods

Posted on: 2016-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Sun
Full Text: PDF
GTID: 2308330476953380
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, PDF has become a mainstream electronic document format. Since 2008, when the first critical exploits against Adobe Reader were discovered, a growing number of PDF files have been used as an attack vector. Compared with other JavaScript-based attacks, however, PDF-based attacks have not received much attention from the research community. This thesis therefore studies PDF document security.

The thesis first introduces the background and current state of research on PDF document security, covering purely static methods, purely dynamic methods, and methods that combine the two. It then describes the PDF format, detailing each component of a PDF file, and the associated security problems. JavaScript is the main concern when analyzing PDF documents for security issues, and this part of the thesis concentrates on those issues.

For static detection, the thesis introduces the principle of static PDF security detection and presents the implementation and improvement of a static detection scheme. JavaScript code is first extracted from the PDF document, and de-obfuscation measures applied during extraction make the subsequent feature analysis more accurate. Given the particular characteristics of PDF document security, we design a model derived from the support vector machine (SVM) to build a more complete machine learning model. Adding sub-models allows the attack patterns of malicious PDF documents to be classified more thoroughly.
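The extraction step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `normalize_names`, `read_literal`, and `extract_js` are hypothetical, and the only de-obfuscation shown is decoding `#xx` hex escapes in PDF names (so that `/J#61vaScript` is recognized as `/JavaScript`). The sketch assumes the script is stored inline after a `/JS` key as a literal or hex string; scripts held in compressed streams or indirect objects, and octal escapes in literal strings, are not handled.

```python
import re


def normalize_names(data: bytes) -> bytes:
    """Decode #xx hex escapes inside PDF names, e.g. /J#61vaScript -> /JavaScript."""
    def fix(match):
        return re.sub(
            rb"#([0-9A-Fa-f]{2})",
            lambda h: bytes([int(h.group(1), 16)]),
            match.group(0),
        )
    # A name starts with "/" and runs until whitespace or a PDF delimiter.
    return re.sub(rb"/[^\s/<>\[\]()%{}]+", fix, data)


def read_literal(data: bytes, pos: int) -> str:
    """Read a PDF literal string whose opening "(" ends just before pos."""
    escapes = {ord("n"): 10, ord("r"): 13, ord("t"): 9, ord("b"): 8, ord("f"): 12}
    depth, out = 1, bytearray()
    while pos < len(data):
        c = data[pos]
        if c == 0x5C and pos + 1 < len(data):  # backslash escape
            nxt = data[pos + 1]
            out.append(escapes.get(nxt, nxt))  # \( \) \\ map to themselves
            pos += 2
            continue
        if c == 0x28:  # "(" opens a nested level
            depth += 1
        elif c == 0x29:  # ")" closes a level
            depth -= 1
            if depth == 0:
                break
        out.append(c)
        pos += 1
    return out.decode("latin-1")


def extract_js(data: bytes) -> list[str]:
    """Return inline scripts found after /JS keys (hex strings first, then literals)."""
    data = normalize_names(data)
    scripts = []
    # Hex string form: /JS <6170702e...>
    for m in re.finditer(rb"/JS\s*<([0-9A-Fa-f\s]+)>", data):
        digits = re.sub(rb"\s", b"", m.group(1))
        if len(digits) % 2:  # an odd final digit is padded with 0
            digits += b"0"
        scripts.append(bytes.fromhex(digits.decode("ascii")).decode("latin-1"))
    # Literal string form: /JS (app.alert\(1\);)
    for m in re.finditer(rb"/JS\s*\(", data):
        scripts.append(read_literal(data, m.end()))
    return scripts


if __name__ == "__main__":
    sample = (
        b"1 0 obj\n<< /S /J#61vaScript /JS (app.alert\\(1\\);) >>\nendobj\n"
        b"2 0 obj\n<< /S /JavaScript /#4AS <6170702e616c6572742832293b> >>\nendobj\n"
    )
    for script in extract_js(sample):
        print(script)
```

The extracted strings would then be passed to feature analysis; normalizing names before searching matters because a plain search for `/JS` or `/JavaScript` misses keys written with hex escapes.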
Compared with traditional schemes, this static detection scheme improves detection accuracy and provides more useful information.

For dynamic detection, the thesis introduces the principle of dynamic PDF security detection and builds a complete dynamic detection system. PDF documents from which shellcode can be extracted are examined directly with the emulator libemu. Other documents are analyzed in detail with a sandbox, Cuckoo Sandbox, which records their behavior. Because the system makes full use of the static detection results and adds the emulator stage, it retains the high accuracy of dynamic detection while reducing testing time and improving detection efficiency compared with purely sandbox-based dynamic detection.

Finally, the PDF document security detection system is implemented and tested with PDF samples collected from the Internet. The experimental results show that the system detects malicious PDF documents quickly and with high accuracy.
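The routing between the emulator and the sandbox described above can be sketched as follows. This is an illustration under stated assumptions, not the system built in the thesis: `Verdict` and `route_sample` are hypothetical names, and the emulator and sandbox are passed in as plain callables standing in for libemu and Cuckoo Sandbox, so no real API of either tool is used.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    malicious: bool
    stage: str  # which stage produced the verdict: "emulator" or "sandbox"


def route_sample(
    pdf_path: str,
    shellcode: Optional[bytes],
    emulate_shellcode: Callable[[bytes], bool],  # stand-in for an emulator check
    run_in_sandbox: Callable[[str], bool],       # stand-in for a sandbox run
) -> Verdict:
    """Send a sample to the cheap emulator when shellcode is available,
    and fall back to the slower sandbox otherwise."""
    if shellcode:
        # Shellcode was recovered during static analysis: emulate it directly.
        return Verdict(emulate_shellcode(shellcode), "emulator")
    # No shellcode could be extracted: observe the whole document's behavior.
    return Verdict(run_in_sandbox(pdf_path), "sandbox")


if __name__ == "__main__":
    # Dummy back ends used only to show the control flow.
    print(route_sample("a.pdf", b"\x90\x90", lambda sc: True, lambda p: False))
    print(route_sample("b.pdf", None, lambda sc: True, lambda p: False))
```

The point of the split is cost: a sandbox run executes the document in a full environment, so only samples that cannot be settled by emulating extracted shellcode are sent there.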
Keywords/Search Tags: PDF documents, static detection, machine learning, dynamic detection, behavior detection