As most Lua users and every Metalua users know, Lua makes a difference between expressions and statements, and it's sometimes cumbersome to put a statement where the context expects an expression. This can be addressed in Lua by using a function closure, but at a considerable runtime cost.
Metalua introduces a `Stat{ } AST node which lets metaprogrammers introduce a statement in an expression's context, but no concrete syntax was provided, partly to discourage the use of stats-as-exprs (which often is of poor taste), partly because I couldn't come up with a satisfying syntax. In this post I'll propose a syntax that seems better, if trickier to implement, than anything I've come by until now.
`Stat{ block, expr }
is an expression node, which evaluates the block, then evaluates the expression
in the block's context (therefore having access to its local variables; that's also how "
repeat foo until bar
" works in Lua, where condition "
bar
" has access to "
foo
"'s local variables). The node's expression value is the second child's value. For instance, the following AST, when executed, prints 42:
print( -{ `Stat{ +{block: local x = 21}, +{expr: 2*x} })
I propose to present stat node between keyword "
stat
" ... "
end
". In that block, the last statement should be a return, and its value is the returned expression. The example above would be written:
print(stat local x=21; return 2*x end)
.
This is easily done by analyzing the content of the block and extracting the content of the return statement:
mlp.lexer:add{ "stat" }
mlp.expr:add{ "stat", mlp.block, "end", builder = stat_builder }
function stat_builder(x)
local returned_expr
local walker_cfg = { ... } -- Here the magic happens
walk.block (walker_cfg, x[1])
return `Stat{ x[1], expr }
end
Walking a block to extract returns (which are not inside a nested function) has already been addressed multiple times, e.g.
here. Now what happens when no return or multiple returns are found? If there's no return, it's easy, just return nil. So above, we simply initialize
returned_expr
at
+{nil}
. In case of multiple returns, we could fail at compile time, but we can actually do better, and rewrite this as follows (supposing that
goto
and
label
are supported by the syntax):
print(stat
local x=21;
if foo then return foo else return 2*x end
end)
is rewritten as:
print(stat
local $1=nil
local x=21
if foo then $1=foo; goto 'retval'
else $1=2*x; goto 'retval' end
label 'retval'
return $1
end)
This way, multiple returns are handled gracefully. Moreover, this code has a semantically equivalent (although much less efficient) Lua implementation: "
stat block end
" can be translated into the closure "
((function() block end)())
".
The complete implementation looks like this:
-{ extension 'xmatch' }
require 'walk'
local function stat_builder (x)
local body, result = x[1]
-- Accumulate direct return statements in `return_stats' list
local return_stats = { }
local walker_cfg = { expr={ }, stat={ } }
match function walker_cfg.stat.up
| `Return{ _ } == x -> table.insert (return_stats, x)
| `Return{ } == x -> x[1] = +{nil}; table.insert (return_stats, x)
| `Return{...} -> error "No multi-returns in stat blocks!"
| _ ->
end
match function walker_cfg.expr.down
| `Function{...} -> return 'break' | _ ->
end
walk.block (walker_cfg, x[1])
match return_stats with
-- No return statement -> resulting value is nil, don't change the body
| { } -> result = +{ nil }
-- One return statement at the end -> just extract it
| { ret } if body[#body] == ret -> result = table.remove (body) [1]
-- Several returns -> use variable and goto statements
| _ ->
local retvar = mlp.gensym 'stat_value'
local label = mlp.gensym 'end_of_stat'
table.insert (body, 1, `Local{ {retvar}, { } })
table.insert (body, `Label{ label })
for ret in ivalues (return_stats) do
ret.tag = nil
ret <- { `Set{ {retvar}, {ret[1]} }, `Goto{ label } }
end
result = retvar
end
return `Stat{ body, result }
end
mlp.lexer:add{ "stat" }
mlp.expr:add{ "stat", mlp.block, "end", builder = stat_builder }
(I've used the xmatch extension described
in a previous post).
Notice, if you find that implementation too verbose, that it performs some operations more complex than most macros found in other languages: code analysis, different replacements depending on the code structure, (manual) hygiene within a Lisp-1 context... The simplest possible implementation, not supporting advanced code walking, would have been:
function stat_builder (x)
local body, result = x[1], +{ nil }
if body[#body].tag == 'Return' then
result = table.remove (body) [1]
end
return `Stat{ body, result }
end
mlp.lexer:add{ "stat" }
mlp.expr:add{ "stat", mlp.block, "end", builder = stat_builder }
The only limitation of the complete version is that multiple returns inside a stat blocks aren't supported. I'd rather keep it that way: the implementation of
`Stat{ }
is already a hack, since Lua
really dislikes statements-as-expressions, at low level as well as at the syntax level. Making it multiple-values compatible would be horribly messy for little added value.
However, it could be handled through rewriting: when at least one return in the block might have multiple values (that means, all returns with a comma, and all returns of a function call), we can:
- Change all "
return foo
" statements into "return {foo}
". - Change the default return value from
+{nil}
to +{{}}
- Surround the
`Stat{ }
node with a call to unpack()
.
Definitely not worth the kludge, nor the performance hit when treating a statement of the form "
return f()
", in my opinion. However, just because I can, here is the code:
-{ extension 'xmatch' }
require 'walk'
local multireturn_tags = table.transpose{ 'Dots', 'Call', 'Invoke' }
local function stat_builder (x)
local body, result = x[1]
-- Accumulate direct return statements in `return_stats' list
local multireturn = false
local return_stats = { }
local walker_cfg = { expr={ }, stat={ } }
match function walker_cfg.stat.up
| `Return{...} == x ->
if #x>1 or multireturn_tags[x[#x].tag] then multireturn = true end
table.insert (return_stats, x)
| _ ->
end
match function walker_cfg.expr.down
| `Function{...} -> return 'break' | _ ->
end
walk.block (walker_cfg, x[1])
match return_stats with
-- No return statement -> resulting value is nil, don't change the body
| { } -> result = +{ nil }; assert(not multireturn)
-- One return statement at the end -> just extract it
| { ret } if body[#body] == ret ->
if multireturn then
result = table.remove (body)
result.tag = 'Table' -- change return stat into list
else
result = table.remove (body) [1]
end
-- Several returns -> use variable and goto statements
| _ ->
local retvar = mlp.gensym 'stat_value'
local label = mlp.gensym 'end_of_stat'
table.insert (body, 1, `Local{ {retvar}, { multireturn and `Table } })
table.insert (body, `Label{ label })
for ret in ivalues (return_stats) do
local this_result
if multireturn then
this_result = table.shallow_copy(ret)
this_result.tag = 'Table'
else
this_result = ret[1]
end
ret.tag = nil
ret <- { `Set{ {retvar}, {this_result} }, `Goto{ label } }
end
result = retvar
end
if multireturn then
return `Call{ `Id "unpack", `Stat{ body, result } }
else
return `Stat{ body, result }
end
end
mlp.lexer:add{ "stat" }
mlp.expr:add{ "stat", mlp.block, "end", builder = stat_builder }