
Research On Optimization And Implementation Of Dynamic Binary Translation In Low Voltage Processors

Posted on: 2017-03-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Li
GTID: 1108330488991032
Subject: Circuits and Systems
Abstract/Summary:
With the rapid growth of the IoT (Internet of Things) and the mobile internet, embedded ARM microprocessors have become increasingly pervasive. On the one hand, binary code compatibility has become a main obstacle for new processor architectures entering the embedded market. On the other hand, embedded processors are usually battery-powered and suffer from short operating times. Dynamic binary translation (DBT) can not only address the binary code compatibility issue by providing cross-platform execution, but also optimize and tune a program at runtime according to runtime information, which allows the processor to achieve higher performance and consume less energy. This work focuses on the main issues of DBT and proposes several optimization methods to speed up DBT. Building on the cross-platform and dynamic-optimization nature of DBT, this work also combines DBT with state-of-the-art resilient techniques to achieve a low-supply-voltage processor with high energy efficiency. The main contributions are as follows:

1. Translation algorithm based on the characteristics of transfer instructions. This work proposes a novel DBT algorithm composed of direct-mapping and transfer-type-inheriting mechanisms to handle transfer translations efficiently, based on the distinguishing characteristics of inner-function and outer-function transfer target addresses. Inner-function transfers are mapped exactly onto transfers of the same type as in the source machine code, which not only reduces the transfers induced by conditional branches but also avoids memory synchronizations. Outer-function transfers are handled differently: the function call-return attributes of the source code are passed into the translated code, which improves the hit rate of the target machine's branch predictor.

2. Cache load balancing oriented dynamic binary translation. This work proposes a hardware-software co-designed DBT address relocation method that speeds up DBT by dynamically shifting load from the instruction cache to the data cache. The key idea is the design of a cache load balancing state for microprocessors. When working in this state, the data cache is divided into two areas: a normal accessing area and a load balancing area. The former caches regular program data as usual. The latter does not cache regular program data; instead, it supports a load transforming channel, which is used to transform and assimilate most of the instruction cache load caused by the DBT scheduler. The main work of the scheduler is converting jump target addresses from the source machine code space to the target machine code space.

3. Low voltage resilient processor based on an offline error prediction model. This work proposes a novel hardware-software co-designed low voltage processor that achieves one-cycle error correction, based on the locality and predictability of timing errors in a processor. The key idea is the design of an offline timing error prediction model based on dynamic binary translation. The model pre-detects most of the timing errors at compilation time and pre-corrects them by embedding timing error alarms into the binary code through a specially designed timing error deletion programming interface. The proposed approach is able to eliminate most of the timing errors with small runtime overhead, which significantly improves overall throughput and energy efficiency.

4. Low voltage resilient processor based on light-weight dynamic optimization. This work proposes a hardware-software co-designed low voltage processor that achieves high energy efficiency and throughput, based on the instruction-level locality of timing errors. The key idea of this collaborative approach is the design of a light-weight two-phase optimizer based on dynamic binary translation. The optimizer works between the hardware layer and the application layer. It collects a program's error information at runtime and, at idle time, generates a timing error alarm before each hot-errant instruction. In later execution, if the instruction is found to cause timing errors again, the optimizer immediately activates the corresponding timing error alarm, which informs the hardware to take measures that eliminate future errors of that instruction. The approach is able to eliminate more than 95% of the timing errors with minimal runtime overhead, which not only reduces the performance and energy overhead of error correction but also relieves pressure on the EDAC (error detection and correction) circuits, thus providing stronger error tolerance and achieving higher performance and energy efficiency.
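
The two-phase behaviour of the light-weight optimizer described above can be illustrated with a small Python toy model. This is only a sketch under stated assumptions, not the dissertation's implementation: the class `TwoPhaseOptimizer`, the helper `run`, the `hot_threshold` value, and the trace format of `(pc, would_err)` pairs are all hypothetical names and choices introduced here for illustration. The model also assumes that an activated alarm fully prevents further errors of that instruction.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical: errors needed before an instruction counts as hot-errant


class TwoPhaseOptimizer:
    """Toy model of a profile-then-alarm optimizer (illustrative assumption only)."""

    def __init__(self, hot_threshold=HOT_THRESHOLD):
        self.hot_threshold = hot_threshold
        self.error_counts = Counter()  # per-instruction timing error counts
        self.alarms = {}               # instruction address -> is the alarm active?

    def record_error(self, pc):
        """Runtime: called when a timing error is reported at address pc."""
        self.error_counts[pc] += 1
        if pc in self.alarms:
            # The instruction erred again after its alarm was generated: activate it.
            self.alarms[pc] = True

    def optimize_at_idle(self):
        """Idle time: generate an inactive alarm for each hot-errant instruction."""
        for pc, count in self.error_counts.items():
            if count >= self.hot_threshold:
                self.alarms.setdefault(pc, False)

    def alarm_active(self, pc):
        """True if the hardware should take preventive measures before executing pc."""
        return self.alarms.get(pc, False)


def run(trace, optimizer, idle_points):
    """Replay (pc, would_err) events and return how many errors actually occur."""
    errors = 0
    for step, (pc, would_err) in enumerate(trace):
        if step in idle_points:
            optimizer.optimize_at_idle()
        # Assumption: an active alarm prevents the error entirely.
        if would_err and not optimizer.alarm_active(pc):
            errors += 1
            optimizer.record_error(pc)
    return errors
```

In this toy model, an instruction that errs on every execution keeps producing errors until an idle point has generated its alarm and one further error has activated it. After that, the remaining executions are protected.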
Keywords/Search Tags: Dynamic Binary Translation, Low Voltage Processor, Computer Architecture, Data Cache, Resilient Error, Transfer Instruction, Hardware-Software Co-design