LuaDec, Lua bytecode decompiler

Overview

LuaDec is a decompiler for the Lua language. It takes compiled Lua bytecodes and attempts to produce equivalent Lua source code on standard output. It targets Lua 5.0.2.

The decompiler performs two passes. The first pass is considered a "preliminary pass". Only two actions are performed: addresses of backward jumps are gathered for while constructs, and variables referenced by CLOSE instructions are taken note of in order to identify the point where explicit do blocks should be opened.

The actual decompilation takes place in the second, "main pass". In this pass, the instructions are symbolically interpreted. As functions are traversed recursively, the symbolic content of registers are kept track of, as well as a few additional data structures: two register stacks, one for "variable registers" (registers currently allocated for variables) one for "temporary registers" (registers holding temporary values); and a list of boolean conditions.

Instructions like ADD and MOVE combine the string representation of symbols and move them around in registers. Emission of actual statements is delayed until the temporaries stack is empty, so that constructs like a,b=b,a are processed correctly.

The list of boolean conditions accumulate pairs of relational operators (or TESTs and jumps). The translation of a list into a boolean expression is done in three stages. In the first stage, jump addresses are checked to identify "then" and "else" addresses and to break the list into smaller lists in the case of nested if statements. In the second stage, the relations between the jumps is analyzed serially to devise a parenthization scheme. In the third scheme, taking into account the parenthization and an 'inversion' flag (used to tell "jump-if-true" from "jump-if-false" expressions), the expression is printed, recursively.

Two forms of "backpatching" are used in the main processing pass: boolean conditions for while constructs are inserted in the marked addresses (as noted in the first pass), and do blocks are added as necessary, according to the liveness information of local variables.

LuaDec is written by Hisham Muhammad, and is licensed under the same terms as Lua. Windows binaries contributed by Fabio Mascarenhas.

Status

LuaDec, in its current form, is not a complete decompiler. It does succeed in decompiling most of the Lua constructs, so it could be used as a tool to aid in retrieving lost sources.

The situations where LuaDec "get it wrong" are usually related to deciding whether a sequence of relational constructs including TEST operators are part of an assignment or an if construct. Also, the "single pass" nature of the symbolic interpreter fails at some corner cases where there simply is not enough information in the sequence of operator/jump pairs to identify what are the "then" and "else" addresses. This is an example of such a case:

1       [2]     LOADNIL         0 2 0
2       [3]     JMP             0 16    ; to 19
3       [4]     EQ              0 1 250 ; - 2
4       [4]     JMP             0 2     ; to 7
5       [4]     TEST            2 2 1
6       [4]     JMP             0 5     ; to 12
7       [4]     EQ              0 1 251 ; - 3
8       [4]     JMP             0 3     ; to 12
9       [4]     LOADK           3 2     ; 1
10      [4]     TEST            3 3 1
11      [4]     JMP             0 0     ; to 12
12      [6]     LOADK           0 2     ; 1
13      [7]     JMP             0 7     ; to 21
14      [8]     LOADK           0 0     ; 2
15      [8]     JMP             0 3     ; to 19
16      [10]    LOADK           0 1     ; 3
17      [11]    JMP             0 3     ; to 21
18      [12]    LOADK           0 4     ; 4
19      [3]     TEST            1 1 1
20      [3]     JMP             0 -18   ; to 3
21      [14]    RETURN          0 1 0
local a, x, y
while x do
   if ((x==2) and y) or ((x==3) and 1) or 0

   then
      a = 1
   do break end
      a = 2
   else
      a = 3
   do break end
      a = 4
   end
   a = 5
end

Notice that there is no reference to the "else" address in the highlighted sequence. Only with a more sophisticated block analysis it would be possible to identify the varying purposes of the JMP instructions at addresses 13, 15 and 17.

For illustrational purposes, here are the results of running LuaDec on the bytecodes generated by the Lua demos included in the test/ subdirectory of the official Lua distribution, as of version 0.4.

bisect.lua - works
cf.lua - works
echo.lua - works
env.lua - works
factorial.lua - works
fibfor.lua - works
fib.lua - works
globals.lua - works
hello.lua - works
life.lua - works
luac.lua - works
printf.lua - works
readonly.lua - works
sieve.lua - works
sort.lua - works
table.lua - works
trace-calls.lua - fails assertion works (fixed in 0.4)
trace-globals.lua - works
undefined.lua - works
xd.lua - works

Running it

To try out LuaDec:

make
bin/luac test/life.lua
bin/luadec luac.out > newlife.lua
bin/lua newlife.lua

LuaDec includes a -d flag, which displays the progress of the symbolic interpretation: for each bytecode, the variable stacks and the code generated so far are shown.

Operation modes

In its default operation mode, LuaDec recursively processes the program functions, starting from the outmost chunk ("main"). When LuaDec detects a decompilation error (by doing sanity checks on its internal data structures), Luadec outputs the portion of the function decompiled so far, and resumes decompilation in the outer closure.

There is an alternative operation mode, set with the -f flag, where all functions are decompiled separately. This does not generate a .lua file structured like the original program, but it is useful in the cases where a decompilation error makes a closure declaration "unreachable" in the default operation mode. This allows a larger portion of the sources to be restored in the event of errors.

Download and Feedback

To download LuaDec, and send any comments, questions or bug reports, please go to the LuaForge page.