Research On Detection Algorithm For Malicious Word And PDF Documents

Posted on:2018-08-28

Degree:Master

Type:Thesis

Country:China

Candidate:X D Tian

Full Text:PDF

GTID:2348330518499445

Subject:Engineering

Abstract/Summary:

With the rapid development of computer networks, more and more people are paying attention to protecting their personal privacy and important data. However, the emergence of a variety of malicious documents has caused great harm. Microsoft Word and PDF documents, whose editing and viewing software is in everyday use, have become a primary target: attacks that exploit document vulnerabilities appear one after another, their number is increasing dramatically, and they cause irreversible losses to users. A malware detection algorithm for suspicious documents would therefore greatly reduce the harm caused by malicious Word and PDF documents.

To address these problems, this thesis introduces the security background and common attacks involving malicious Word and PDF documents, and then reviews recent research on document detection. Known static detection suffers from low accuracy, while dynamic detection suffers from long detection time. Machine learning has a powerful ability to learn from data and to uncover hidden statistical regularities, so more and more security researchers are applying it to malware detection. Building on existing research, this thesis proposes two faster and more effective algorithms using machine learning:

1) Dynamic Detection of Malicious Word and PDF Based on API Behavior and Deep Learning Model Inception V3

Sandboxing is the most commonly used dynamic detection technique, but it incurs time overhead and depends on a virtualized instruction system. Using an improved Cuckoo sandbox, this thesis designs a malicious document detection algorithm based on the deep learning model GoogLeNet Inception V3. The results produced by running documents in the Cuckoo sandbox are abstracted according to API dependency, and the resulting document feature vector is converted into a two-dimensional image. When the image is input, the Inception V3 network extracts the bottleneck features, and a classifier is then trained on them using transfer learning, which completes the detection. Experiments show that this detection algorithm achieves good time performance on unknown malicious Word and PDF documents, with a detection rate of 89.1%.

2) Static Detection of Malicious PDF Based on K-means Cluster and Deep Text Feature Detection Network

Traditional static detection of PDF documents is generally aimed at a specific attack, and its detection rate is too low. In view of these problems, this thesis designs a static detection algorithm for PDF documents that consists of two parts: extraction of distinguishing text features based on K-means, and classification based on the deep text feature detection network. The extraction uses PDFMiner and K-means clustering to obtain the text features that distinguish malicious from benign documents, while the deep text feature detection network is a purpose-designed 15-layer deep linear neural network. Experiments show that this detection algorithm achieves good results on unknown malicious PDF documents, with a detection rate of 86.6%, and that it can also effectively detect malicious PDF documents under different attacks.
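The vector-to-image step of the dynamic detection algorithm can be illustrated with a minimal sketch. The abstract does not say how the feature vector is encoded or laid out, so everything here is an assumption made for illustration: the function name `vector_to_image`, the use of per-API call counts as features, min-max scaling to grey levels, and zero padding to a square grid are all hypothetical choices, not the thesis's method.

```python
import math


def vector_to_image(features, pad_value=0):
    """Fold a 1-D feature vector into a square 2-D grid of grey levels (0..255).

    Hypothetical helper: the scaling and layout are illustrative assumptions.
    """
    if not features:
        raise ValueError("features must not be empty")

    lo, hi = min(features), max(features)
    span = hi - lo
    # Min-max scale each value to an integer grey level.
    # A constant vector has no spread, so it maps to all zeros.
    pixels = [round(255 * (x - lo) / span) if span else 0 for x in features]

    # Smallest square side such that side * side >= number of pixels.
    side = math.isqrt(len(pixels) - 1) + 1
    # Pad the tail so the vector fills the square exactly.
    pixels += [pad_value] * (side * side - len(pixels))

    # Cut the flat list into rows of equal length.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Example: assumed call counts for five API functions observed in the sandbox.
image = vector_to_image([0, 2, 4, 6, 10])
```

An image produced this way would still have to be resized and expanded to three channels before being passed to a pretrained Inception V3 network, whose standard input is 299x299 RGB; that step is omitted here.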

Keywords/Search Tags:

Malicious Document, TensorFlow, Deep Learning, Static Analysis, Dynamic Analysis

Related items

1	Combining Static And Dynamic Analysis Of The Malicious PDF Document
2	Android Security Threats Analysis Based On Dynamic And Static Taint Flow
3	Design And Implementation Of Malicious PDF Document Detection System Based On The Static Analysis Technology
4	Research And Implementation Of Malicious PDF Document Detection Technology
5	The Design And Implementation Of A Static Analysis Based Malicious App Detection Tool
6	An Android Malware Detection Method Based On Deep Learning Of Dynamic And Static Features
7	An Android Malware Detection Model Based On Tensor Decomposition
8	Web Malicious Script Detection Technology Research Based On Dynamic And Static Analysis
9	Research On Detection And Protection Of Malware On Android System
10	Research On Android Malicious Software Detection Based On Deep Learning