A jump-target identification method for multi-architecture static binary translation
DI FEDERICO, ALESSANDRO; AGOSTA, GIOVANNI
2016-01-01
Abstract
Static binary translation is a technique that translates an executable program built for one architecture into an executable for a different one, with lower overhead than emulators and dynamic binary translators. The main downside of the static approach is the absence of runtime information, which the other solutions have available. In particular, one of the key issues is distinguishing data from code in the program and, more specifically, detecting basic block start addresses (jump targets). Indirect jump instructions whose target is not immediately evident, in particular those generated for C switch statements, make the recovery of jump targets a challenging task. In this paper, we present an effective technique for jump target identification composed of an initial step of global data harvesting followed by two novel analyses: the Simple Expression Tracker and the Offset Shifted Range Analysis (OSRA). Both analyses work on a Static Single Assignment (SSA) intermediate representation and are iterated until they provide no additional information. In particular, OSRA is a data-flow analysis modeled after the typical code generated for switch statements. It tracks each SSA value in terms of an offset, a scaling factor, and another SSA value bounded between a lower and an upper bound (e.g., b = 10 + 4 · x, with 8 ≤ x ≤ 10). To validate the effectiveness of the proposed technique, we employ revamb, an in-house tool for binary translation leveraging QEMU and the LLVM compiler framework. Our experimental results show that we are able to run the coreutils test suite on ARM, MIPS and x86-64 without significant failures due to unidentified jump targets.
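The bounded expression in the abstract's example can be made concrete with a short sketch. The `BoundedValue` class and its method names are hypothetical illustrations, not part of revamb or of the paper's implementation. The sketch only models the tuple of offset, scaling factor, and bounds described above, and shows how a bounded expression yields a finite set of candidate values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BoundedValue:
    """Hypothetical model of: value = offset + factor * x, with lower <= x <= upper."""
    offset: int
    factor: int
    lower: int
    upper: int

    def add(self, constant: int) -> "BoundedValue":
        # Adding a constant only shifts the offset.
        return BoundedValue(self.offset + constant, self.factor, self.lower, self.upper)

    def mul(self, constant: int) -> "BoundedValue":
        # Multiplying by a constant scales both the offset and the factor.
        return BoundedValue(self.offset * constant, self.factor * constant,
                            self.lower, self.upper)

    def candidates(self) -> List[int]:
        # Enumerate every value the expression can take within the bounds.
        return [self.offset + self.factor * x
                for x in range(self.lower, self.upper + 1)]


# The example from the abstract: b = 10 + 4 * x, with 8 <= x <= 10.
b = BoundedValue(offset=10, factor=4, lower=8, upper=10)
print(b.candidates())  # [42, 46, 50]

# A switch-like pattern (assumed for illustration): an index in [0, 2],
# scaled by 4 and added to a made-up jump table base address 0x1000.
index = BoundedValue(offset=0, factor=1, lower=0, upper=2)
entry_address = index.mul(4).add(0x1000)
print([hex(a) for a in entry_address.candidates()])  # ['0x1000', '0x1004', '0x1008']
```

Because the bounds are finite, each tracked value expands to a small, enumerable set. That property is what would let a static analysis list the jump table entries an indirect jump may read.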