
Research On File Fragment Type Detection Algorithm In Digital Forensics Based On Self-Attention Mechanism

Posted on: 2022-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Sun
Full Text: PDF
GTID: 2518306533954939
Subject: Computer technology
Abstract/Summary:
Digital forensics is a branch of forensics centered on data evidence. As an important research topic in information security, it usually comprises three stages: evidence acquisition, analysis, and reporting. Acquisition mainly collects electronic evidence, such as memory images, Internet history, and the actual files on a drive. Analysis uses a range of methods and tools to recover evidentiary material; common methods include keyword searches over the entire digital medium (files and unallocated free space), restoring deleted files, and extracting registry information (for example, to list user accounts and devices). Reporting presents the results obtained through analysis in order to reveal illegal acts or negligence involving digital technology products. With the rapid development of related technologies, digital forensics has been widely used in legal cases.

During digital forensics, various types of digital files, such as documents, pictures, and audio, need to be extracted as evidence. These materials are often overlooked or maliciously damaged, which leaves them incomplete. To improve the accuracy of data recovery, and with it the accuracy of digital forensics as a whole, the types of the file fragments must be detected first. The difficulty is that when only fragments of an overlooked or maliciously destroyed file remain, little file metadata is available for type detection, which lowers the accuracy of file fragment type detection. File classification helps file recovery when general recovery methods are ineffective. Common file recovery methods use file signature databases, such as extensions or magic numbers, to identify the headers of known file types and then recover a contiguous region of disk space. However, when files lie in unallocated space, disk space is missing, or files are fragmented and reliable file system metadata is unavailable, general file recovery methods cannot recover complete files. Analysts need an automated method for classifying the file type of a fragment to facilitate the recovery of complete files, that is, a file fragment type detection method.

File fragment type detection, as a key technology in digital forensics, plays a vital role in the investigation of criminal cases. Previous research on file fragment type detection in digital forensics was mainly based on traditional machine learning, neural networks, and gray-scale image conversion, and relied mainly on extracted features such as N-grams and Shannon entropy. These methods lean toward statistical analysis of features during feature extraction and do not consider the similarity relationships within the fragment content. As a result, the accuracy and precision of existing file fragment type detection algorithms are not high.

To address these difficulties, this thesis proposes a file fragment type detection algorithm for digital forensics based on the self-attention mechanism. The algorithm is implemented as a deep neural network consisting of an input layer, an embedding layer, a multi-head attention layer, a global average pooling layer, a dropout layer, and a fully connected layer. For different file fragment types, the self-attention mechanism can learn the correlation between bytes, preserve the contextual relationship between bytes, and learn the byte characteristics of different file types.

The thesis uses the general-purpose GovDocs data set as the experimental data set. First, the files are divided into fixed-size blocks, usually 512 bytes or 4096 bytes, and each block is converted into an input matrix, which the embedding layer maps to continuous vectors. The multi-head attention layer then learns the similarity between bytes and extracts feature vectors for the different file types. Global average pooling and the dropout layer reduce the degree of overfitting, and the fully connected layer performs the final classification. Analysis of the experimental results shows that the proposed self-attention-based file fragment type detection algorithm performs well, with an accuracy of 81.2%. A comparison with related algorithms from China and abroad further demonstrates the validity and accuracy of the proposed algorithm.
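To make the fragmentation step and the traditional statistical features named in the abstract (N-grams, Shannon entropy) concrete, here is a minimal Python sketch. It is not taken from the thesis: the function names, the choice to drop a short trailing block, and the use of 2-grams are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def split_into_fragments(data: bytes, block_size: int = 512) -> List[bytes]:
    """Cut data into consecutive blocks of block_size bytes.

    Assumption: an incomplete trailing block is dropped.
    """
    n_full = len(data) // block_size
    return [data[i * block_size:(i + 1) * block_size] for i in range(n_full)]


def shannon_entropy(fragment: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)  # iterating bytes yields ints 0..255
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def bigram_counts(fragment: bytes) -> Counter:
    """Count overlapping byte 2-grams, a simple instance of an N-gram feature."""
    return Counter(zip(fragment, fragment[1:]))


if __name__ == "__main__":
    # Synthetic data: 1024 varied bytes followed by 100 zero bytes.
    data = bytes(range(256)) * 4 + b"\x00" * 100
    for index, fragment in enumerate(split_into_fragments(data, 512)):
        print(index, len(fragment), round(shannon_entropy(fragment), 3))
```

Features of this kind summarize how often byte values occur but say nothing about how bytes at different positions relate to each other, which is the limitation the abstract attributes to earlier methods.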
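The layer sequence described in the abstract (embedding, multi-head attention, global average pooling, dropout, fully connected layer) can be illustrated with a forward pass in NumPy. This is only a sketch under stated assumptions, not the thesis's implementation: the class name, the model width, the number of heads, and the random untrained weights are all hypothetical, positional information is omitted because the abstract does not mention it, and dropout is treated as the identity because the sketch covers inference only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


class FragmentClassifierSketch:
    """Untrained forward pass; every hyperparameter here is an assumption."""

    def __init__(self, n_classes, d_model=32, n_heads=4, seed=0):
        assert d_model % n_heads == 0
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(d_model)
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Embedding layer: one d_model-dimensional vector per byte value.
        self.embedding = rng.normal(0.0, 1.0, (256, d_model))
        # Query, key, value and output projections of the attention layer.
        self.w_q = rng.normal(0.0, scale, (d_model, d_model))
        self.w_k = rng.normal(0.0, scale, (d_model, d_model))
        self.w_v = rng.normal(0.0, scale, (d_model, d_model))
        self.w_o = rng.normal(0.0, scale, (d_model, d_model))
        # Fully connected classification layer.
        self.w_out = rng.normal(0.0, scale, (d_model, n_classes))
        self.b_out = np.zeros(n_classes)

    def _split_heads(self, x):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        seq_len = x.shape[0]
        return x.reshape(seq_len, self.n_heads, self.d_head).transpose(1, 0, 2)

    def forward(self, fragment: bytes):
        ids = np.frombuffer(fragment, dtype=np.uint8)
        x = self.embedding[ids]                       # (seq_len, d_model)
        q = self._split_heads(x @ self.w_q)
        k = self._split_heads(x @ self.w_k)
        v = self._split_heads(x @ self.w_v)
        # Scaled dot-product attention: each byte attends to every byte.
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(self.d_head)
        weights = softmax(scores, axis=-1)            # (n_heads, seq, seq)
        heads = weights @ v                           # (n_heads, seq, d_head)
        merged = heads.transpose(1, 0, 2).reshape(len(ids), -1) @ self.w_o
        pooled = merged.mean(axis=0)                  # global average pooling
        # Dropout would be applied here during training; identity at inference.
        logits = pooled @ self.w_out + self.b_out
        return softmax(logits), weights


if __name__ == "__main__":
    model = FragmentClassifierSketch(n_classes=5)
    probs, attn = model.forward(bytes(range(256)) * 2)  # one 512-byte fragment
    print(probs.shape, attn.shape)
```

The attention weights have one row per byte position, so every byte's representation is a weighted mix of all other bytes in the fragment. That is how a layer of this kind can capture relationships between bytes rather than only their frequencies. With random weights the predicted probabilities are meaningless; a real model would be trained on labeled fragments.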
Keywords/Search Tags: Digital forensics, File fragment type detection, Self-attention mechanism, Deep learning