We present techniques for eliminating dispatch overhead in a virtual machine interpreter using lightweight just-in-time native-code compilation. In the context of the Tcl VM, we convert bytecodes to native code by concatenating the native instructions that the VM uses to implement each bytecode instruction, thereby eliminating the dispatch loop. Furthermore, immediate arguments of bytecode instructions are substituted into the native code using run-time specialization. Native code output by a C compiler is not amenable to relocation by copying; the copied code must be fixed up to execute correctly. The resulting increase in code size appears to be impractical. We evaluate performance using hardware performance counters and system simulation. Some benchmarks achieve speedups of up to 50%, but roughly half slow down or show little change. Most of the slowdown is attributable to I-cache overflow caused by the increased code size and to increased compilation time. Larger I-caches broaden the applicability of the technique.
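
The three steps named above (catenation of per-bytecode native bodies, specialization of immediates, and fix-up after copying) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the system's implementation: "native" instructions are Python tuples rather than machine code, each instruction occupies one address unit, and every name (`TEMPLATES`, `HELPER_ADDR`, `catenate`, the opcode and instruction names) is hypothetical.

```python
# Toy model of catenation + specialization + fix-up.
# Nothing here is real machine code; all names are illustrative assumptions.

HOLE = object()      # marks where a bytecode immediate is substituted
HELPER_ADDR = 1000   # assumed absolute address of a runtime helper routine

# One template per bytecode: the "native" body as it sits inside the
# interpreter at address `base`. ("call_rel", d) is a pc-relative call whose
# target is (address of the call instruction + d).
TEMPLATES = {
    "push": {"base": 100, "code": [("load_imm", HOLE), ("push_acc",)]},
    "add":  {"base": 200, "code": [("pop2",),
                                   ("call_rel", HELPER_ADDR - 201),
                                   ("push_acc",)]},
}

def catenate(bytecode, dest_base=5000):
    """Copy each bytecode's template end to end into one straight-line body."""
    out = []
    for op, imm in bytecode:
        tmpl = TEMPLATES[op]
        for i, insn in enumerate(tmpl["code"]):
            old_addr = tmpl["base"] + i
            new_addr = dest_base + len(out)
            if insn[0] == "call_rel":
                # Fix-up: the copy moved, so the relative displacement must be
                # adjusted to keep reaching the same absolute target.
                insn = ("call_rel", insn[1] + old_addr - new_addr)
            else:
                # Specialization: substitute the immediate into the hole.
                insn = tuple(imm if x is HOLE else x for x in insn)
            out.append(insn)
    return out

code = catenate([("push", 2), ("push", 3), ("add", None)])
for addr, insn in enumerate(code, start=5000):
    print(addr, insn)
```

The output body contains no dispatch between bytecodes, and each bytecode expands to several instructions, which hints at why code size grows with this approach.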