Research on Security and Crash Consistency of Memory Architecture for Heterogeneous System

Posted on: 2022-02-08  Degree: Doctor  Type: Dissertation
Country: China  Candidate: B Di  Full Text: PDF
GTID: 1488306731983509  Subject: Computer Science and Technology
Abstract/Summary:
Owing to their high performance, heterogeneous systems are rapidly gaining popularity in many areas. However, the memory architecture of heterogeneous systems is subject to security and crash consistency issues. In graphics processing units (GPUs), a malicious kernel can leverage buffer overflows to corrupt the data of a benign kernel, change the execution flow of a benign kernel, and even execute malicious code. A malicious kernel can also exploit the TLB to mount a denial-of-service (DoS) attack, continuously evicting page table entries (PTEs) to degrade the performance of co-running benign kernels. For persistent memories (PMs), when a failure happens (e.g., a system crash or power failure), data still in the cache is lost because it has not been flushed to PM. We call this a crash consistency issue, and it can damage the reliability of PMs. For example, a crash consistency issue can cause data loss, which can change the execution flow of the program and even leak private information. To address these challenges, this work makes the following main contributions.

First, we explore memory overflow on the CUDA platform. The results show that memory overflow can happen within the same kernel, between concurrent kernels based on streams, between concurrent kernels based on CPU threads, and between concurrent kernels based on CPU processes. This indicates a high risk of corrupting data, or of crashing the whole system by overwriting function pointers.

Second, we introduce GMOD, a runtime software system that detects GPU buffer overflows. GMOD performs always-on monitoring of dynamically allocated buffers using a canary-based design. To enable high performance, GMOD introduces a set of byte arrays that store the buffer information needed for overflow detection. In addition, GMOD introduces several techniques, such as lock-free accesses to the byte arrays, delayed memory free for high-performance memory deallocation, and efficient memory reallocation and garbage collection for the byte arrays. Our experiments show that GMOD is capable of detecting buffer overflows at runtime with small runtime overhead (2.9% on average and up to 9.1%).

Third, we introduce GMODx, which utilizes unified memory to delegate the always-on monitoring to the CPU. To reduce performance overhead, we propose several techniques, including a customized list data structure and optimizations specific to unified memory. Furthermore, the runtime is encapsulated within a dynamic shared library that interposes the memory allocation APIs, making it possible to offer protection transparently without any changes to applications. In our evaluations, GMODx incurs small runtime overhead (3.5% on average and up to 13.6%). To evaluate GMODx with real workloads, we deploy GMODx on the TensorFlow framework, where it causes only 0.8% overhead on average (up to 1.8%).

Fourth, based on the micro-architectural properties of GPUs, we introduce a software-based system, TLB-pilot, which binds thread blocks of different kernels to different groups of streaming multiprocessors by considering the hardware isolation of last-level TLBs and each application's resource requirements. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. Results show that under a TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround time (ANTT) and system throughput (STP), respectively.

Fifth, we propose PMDebugger, a debugger that detects crash consistency bugs. Unlike prior work, PMDebugger is fast, flexible, and comprehensive in bug detection. The design of PMDebugger is driven by a characterization of how three fundamental operations (store, cache writeback, and fence) typically happen in PM programs. PMDebugger uses a hierarchical design composed of PM debugging-specific data structures, operations, and bug-detection algorithms (rules). We generalize nine rules to detect crash consistency bugs for various PM persistency models. Compared with a state-of-the-art detector (XFDetector) and an industry-quality detector (Pmemcheck), PMDebugger achieves average speedups of 49.3x and 3.4x, respectively. Compared with another state-of-the-art detector (PMTest) optimized for high performance, PMDebugger achieves comparable performance without relying heavily on programmer annotations. PMDebugger also identifies more bugs than XFDetector, Pmemcheck, and PMTest. PMDebugger detects 19 new bugs in a real application (memcached) and 2 new bugs in Intel PMDK.
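
The crash consistency issue described above can be illustrated with a minimal sketch. It is not PMDebugger's design or any of its nine rules; it assumes one simple, commonly used persistency rule: a store is guaranteed durable only after a writeback of its cache line that is followed by a fence. The function name `unpersisted_stores`, the trace format, and the 64-byte cache line are all hypothetical choices made for this illustration.

```python
# Illustrative sketch only: NOT PMDebugger's data structures or rules.
# Assumed rule: a store is durable once its cache line has been written
# back AND a later fence has completed that writeback.

CACHE_LINE = 64  # assumed cache-line size in bytes


def unpersisted_stores(trace):
    """Return the sorted store addresses that could be lost on a crash.

    `trace` is a list of tuples in a hypothetical format:
    ("store", addr), ("writeback", addr), or ("fence",).
    """
    dirty = {}    # cache line -> stored addresses not yet written back
    pending = {}  # cache line -> addresses written back but not yet fenced

    for op in trace:
        kind = op[0]
        if kind == "store":
            line = op[1] // CACHE_LINE
            dirty.setdefault(line, set()).add(op[1])
        elif kind == "writeback":
            line = op[1] // CACHE_LINE
            if line in dirty:
                # The whole cache line is written back at once.
                pending.setdefault(line, set()).update(dirty.pop(line))
        elif kind == "fence":
            # Writebacks issued before the fence are now complete.
            pending.clear()
        else:
            raise ValueError(f"unknown operation: {kind}")

    lost = set()
    for addrs in dirty.values():
        lost |= addrs
    for addrs in pending.values():
        lost |= addrs
    return sorted(lost)


if __name__ == "__main__":
    trace = [
        ("store", 0), ("writeback", 0), ("fence",),  # durable
        ("store", 128),                              # never written back
        ("store", 256), ("writeback", 256),          # written back, no fence
    ]
    print(unpersisted_stores(trace))  # prints [128, 256]
```

In this trace the store to address 0 is durable, while the stores to 128 and 256 could be lost if a failure occurred at the end of the trace: the first was never written back, and the second was written back without a subsequent fence.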
Keywords/Search Tags: Heterogeneous System, GPU, Persistent Memory, Security, Crash Consistency Issue