| In recent years, Advanced Persistent Threat (APT) attacks have become increasingly intense. APT attacks are sustained, effective attacks on specific organizations such as governments and enterprises. As people have become more vigilant about executable files, APT groups have gradually shifted toward crafting malicious documents. Spear phishing is a major attack method of APT groups and often causes serious losses to email users. Because email readers tend to be less alert to document files in attachments, attackers increasingly favor malicious document files. Common formats such as PDF and Word are becoming the main document formats for exchanging information on the Internet. However, many types of objects can be embedded in these documents, which allows a document to exhibit malicious behavior in many forms, so detecting malicious documents is becoming more important. This study takes the documents commonly used in network information exchange as its research object and studies methods for rapidly detecting malicious documents. Based on a survey of recent domestic and international research and on further sample analysis, it proposes a multi-view detection framework that provides a reference for quickly judging whether a document is malicious. Building on this framework, it also designs and implements a web-based static document detection system. The main research work and results are as follows: (1) A multi-view static detection scheme for document maliciousness is proposed. The scheme extracts features from multiple views, including the standardization of the document, the error information of the document, the structure paths of the document, and the number of objects, and trains a model with machine learning algorithms. (2) Feature engineering is carried out for documents in PDF, Word, Excel, RTF, JPEG, PNG and GIF formats to determine the features and weights appropriate to each document type. (3) The proposed detection model is tested on Word documents and achieves a high detection rate (97.36%) and a low false positive rate (0.27%). Good experimental results are also obtained on image files and other Office documents, and a comparison between the single-view and multi-view detection experiments demonstrates the effectiveness of the multi-view detection method. (4) A web-based static document detection platform is designed and implemented. The platform performs fast static detection of common document files and supports sample detection and result queries through an API. |
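
The detection rate and false positive rate reported for Word documents can be made concrete with a small sketch. It assumes the conventional definitions, which the abstract does not spell out: detection rate as TP / (TP + FN) over malicious samples, and false positive rate as FP / (FP + TN) over benign samples. The function name `detection_metrics` and the example labels are hypothetical and are not taken from the study's data.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for binary labels.

    Assumed label convention: 1 = malicious, 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed

    # Guard against empty classes so the sketch never divides by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Illustrative labels only: 4 malicious and 6 benign samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
dr, fpr = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.2%}, false positive rate = {fpr:.2%}")
```

Reporting the two rates separately matters because benign documents usually far outnumber malicious ones, so even a small false positive rate can produce many false alarms in practice.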