
LuaDec, Lua bytecode decompiler
Overview
LuaDec is a decompiler for the Lua language. It takes compiled Lua bytecodes and attempts to produce equivalent Lua source code on standard output. It targets Lua 5.0.2.
The decompiler performs two passes. The first pass is considered a "preliminary
pass". Only two actions are performed: addresses of backward jumps are gathered
for while
constructs, and variables referenced by CLOSE
instructions are taken note of in order to identify the point where explicit
do
blocks should be opened.
The actual decompilation takes place in the second, "main pass". In this pass, the instructions are symbolically interpreted. As functions are traversed recursively, the symbolic content of registers are kept track of, as well as a few additional data structures: two register stacks, one for "variable registers" (registers currently allocated for variables) one for "temporary registers" (registers holding temporary values); and a list of boolean conditions.
Instructions like ADD
and MOVE
combine the string
representation of symbols and move them around in registers. Emission of actual
statements is delayed until the temporaries stack is empty, so that constructs
like a,b=b,a
are processed correctly.
The list of boolean conditions accumulate pairs of relational operators (or
TEST
s and jumps). The translation of a list into a boolean expression
is done in three stages. In the first stage, jump addresses are checked to identify
"then" and "else" addresses and to break the list into smaller lists in the case of
nested if
statements. In the second stage, the relations between the
jumps is analyzed serially to devise a parenthization scheme. In the third scheme,
taking into account the parenthization and an 'inversion' flag (used to tell
"jump-if-true" from "jump-if-false" expressions), the expression is printed,
recursively.
Two forms of "backpatching" are used in the main processing pass: boolean conditions
for while
constructs are inserted in the marked addresses (as noted in
the first pass), and do
blocks are added as necessary, according to the
liveness information of local variables.
LuaDec is written by Hisham Muhammad, and is licensed under the same terms as Lua. Windows binaries contributed by Fabio Mascarenhas.
Status
LuaDec, in its current form, is not a complete decompiler. It does succeed in decompiling most of the Lua constructs, so it could be used as a tool to aid in retrieving lost sources.
The situations where LuaDec "get it wrong" are usually related to deciding
whether a sequence of relational constructs including TEST
operators
are part of an assignment or an if
construct. Also, the "single pass"
nature of the symbolic interpreter fails at some corner cases where there simply
is not enough information in the sequence of operator/jump pairs to identify what
are the "then" and "else" addresses. This is an example of such a case:
1 [2] LOADNIL 0 2 0 2 [3] JMP 0 16 ; to 19 3 [4] EQ 0 1 250 ; - 2 4 [4] JMP 0 2 ; to 7 5 [4] TEST 2 2 1 6 [4] JMP 0 5 ; to 12 7 [4] EQ 0 1 251 ; - 3 8 [4] JMP 0 3 ; to 12 9 [4] LOADK 3 2 ; 1 10 [4] TEST 3 3 1 11 [4] JMP 0 0 ; to 12 12 [6] LOADK 0 2 ; 1 13 [7] JMP 0 7 ; to 21 14 [8] LOADK 0 0 ; 2 15 [8] JMP 0 3 ; to 19 16 [10] LOADK 0 1 ; 3 17 [11] JMP 0 3 ; to 21 18 [12] LOADK 0 4 ; 4 19 [3] TEST 1 1 1 20 [3] JMP 0 -18 ; to 3 21 [14] RETURN 0 1 0
local a, x, y while x do if ((x==2) and y) or ((x==3) and 1) or 0 then a = 1 do break end a = 2 else a = 3 do break end a = 4 end a = 5 end
Notice that there is no reference to the "else" address in the highlighted sequence.
Only with a more sophisticated block analysis it would be possible to identify the
varying purposes of the JMP
instructions at addresses 13, 15 and 17.
For illustrational purposes, here are the results of running LuaDec on the
bytecodes generated by the Lua demos included in the test/
subdirectory
of the official Lua distribution, as of version 0.4.
bisect.lua - works |
cf.lua - works |
echo.lua - works |
env.lua - works |
factorial.lua - works |
fibfor.lua - works |
fib.lua - works |
globals.lua - works |
hello.lua - works |
life.lua - works |
luac.lua - works |
printf.lua - works |
readonly.lua - works |
sieve.lua - works |
sort.lua - works |
table.lua - works |
trace-calls.lua - |
trace-globals.lua - works |
undefined.lua - works |
xd.lua - works |
Running it
To try out LuaDec:
make bin/luac test/life.lua bin/luadec luac.out > newlife.lua bin/lua newlife.lua
LuaDec includes a -d
flag, which displays the progress
of the symbolic interpretation: for each bytecode, the variable stacks
and the code generated so far are shown.
Operation modes
In its default operation mode, LuaDec recursively processes the program functions, starting from the outmost chunk ("main"). When LuaDec detects a decompilation error (by doing sanity checks on its internal data structures), Luadec outputs the portion of the function decompiled so far, and resumes decompilation in the outer closure.
There is an alternative operation mode, set with the -f
flag,
where all functions are decompiled separately. This does not generate a .lua file
structured like the original program, but it is useful in the cases where a
decompilation error makes a closure declaration "unreachable" in the default
operation mode. This allows a larger portion of the sources to be restored in
the event of errors.
Download and Feedback
To download LuaDec, and send any comments, questions or bug reports, please go to the LuaForge page.