How about moving Yue's auto-generated anonymous functions to upvalues with the name prefix '__'? #162
It is a brilliant idea! And it seems the TypeScriptToLua compiler is doing similar optimizations too.
f = ->
  func if cond
    print 123
    true
  else
    false
is compiled to:
local _anon_func_0 = function(cond, print)
  if cond then
    print(123)
    return true
  else
    return false
  end
end
local f
f = function()
  -- passing the accessed global variable "print" from the call site
  return func(_anon_func_0(cond, print))
end
onEvent "start", ->
  -- the "with" expression below, which generates an anonymous function, can be optimized
  gameScene\addChild with ScoreBoard!
    gameScore = 100
    \schedule (deltaTime) ->
      .updateScore gameScore
compiles to:
local _anon_func_0 = function(ScoreBoard)
  local _with_0 = ScoreBoard()
  local gameScore = 100
  _with_0:schedule(function(deltaTime)
    return _with_0.updateScore(gameScore)
  end)
  return _with_0
end
onEvent("start", function()
  return gameScene:addChild(_anon_func_0(ScoreBoard))
end)
But in another case:
onEvent "start", ->
  gameScore = 100
  -- the "with" expression below cannot be optimized because it captures the upvalue "gameScore"
  gameScene\addChild with ScoreBoard!
    \schedule (deltaTime) ->
      .updateScore gameScore
compiles to:
onEvent("start", function()
  local gameScore = 100
  return gameScene:addChild((function()
    local _with_0 = ScoreBoard()
    _with_0:schedule(function(deltaTime)
      return _with_0.updateScore(gameScore)
    end)
    return _with_0
  end)())
end)
-- the case to optimize
exe_func = (func, env) ->
  ok, ... = try
    debug_env_before(env)
    func(env)
    debug_env_after(env)
  catch ex
    -- accesses both 'ex' and 'error'
    error ex
    return ex
  if ok
    return ...
  else
    os.exit(1)
compiles to:
local _anon_func_0 = function(os, _arg_0, ...)
  do
    local ok = _arg_0
    if ok then
      return ...
    else
      return os.exit(1)
    end
  end
end
local _anon_func_1 = function(debug_env_after, debug_env_before, env, func)
  do
    debug_env_before(env)
    func(env)
    return debug_env_after(env)
  end
end
local exe_func
exe_func = function(func, env)
  -- there is no way to pass the global variable 'error' to the handler,
  -- so we have to keep the anonymous callback function below
  return _anon_func_0(os, xpcall(_anon_func_1, function(ex)
    error(ex)
    return ex
  end, debug_env_after, debug_env_before, env, func))
end
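The compiled output above depends on xpcall forwarding its extra arguments to the protected function, which is what lets the hoisted body receive its former upvalues as parameters. A minimal sketch of that mechanism, assuming Lua 5.2 or later, or LuaJIT (Lua 5.1's xpcall does not forward arguments); `guarded_body` and `on_error` are hypothetical names, not part of Yue's output:

```lua
-- Sketch only: assumes Lua 5.2+ or LuaJIT, where xpcall(f, msgh, ...) passes
-- the trailing arguments to f. The names below are hypothetical.
local function guarded_body(a, b)
  -- Receives its inputs as parameters instead of capturing them as upvalues.
  return a + b
end

local function on_error(err)
  -- The message handler receives only the error value, no extra arguments.
  return "handled: " .. tostring(err)
end

-- Success path: the extra arguments 2 and 3 are forwarded to guarded_body.
local ok, sum = xpcall(guarded_body, on_error, 2, 3)
print(ok, sum)  -- true 5

-- Failure path: adding nil raises an error, which on_error converts to a string.
local failed_ok, message = xpcall(guarded_body, on_error, 2, nil)
print(failed_ok, message)
```

Note that the handler only ever receives the error value, which is consistent with the comment in the compiled code about not being able to pass anything else to it.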
Yes! There are cases where Yue does not call the function itself. But if the function needs to access and modify variables of the enclosing closure, then we cannot do anything about it.
Maybe I missed something important, but I don't see this being anywhere near the problem it is presented as here. I made a test script based on the code provided in the initial comment: https://gist.github.com/SkyyySi/dce94707e15c1f5c304285cf9c524abc My results were that the outlined function, while slightly faster to be fair, didn't provide a significant difference:
And that shouldn't really be a surprise, because closures do not create a new function each time they are evaluated! Lua code only gets compiled once, before it is launched. After that, closures behave more like a struct, bundling a function pointer with an array of pointers to the captured variables (upvalues). That's why it looks like a different object each time. But all you are doing is setting a pointer. Of course, now that this optimization is already here, you may as well keep it, but please always test your assumptions properly before optimizing.
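To make concrete what is and is not created when a function expression is evaluated, here is a small sketch (the name `make_counter` is hypothetical). The function body is compiled once, but each evaluation that captures a fresh local yields a distinct closure object with its own upvalue:

```lua
-- Sketch only: make_counter is a hypothetical name used for illustration.
local function make_counter()
  local count = 0              -- a fresh local on every call
  return function()            -- evaluated on every call, capturing `count`
    count = count + 1
    return count
  end
end

local a = make_counter()
local b = make_counter()

-- Same compiled body, but two separate closure objects.
print(a == b)  -- false

-- Each closure keeps its own captured state.
local first = a()
local second = a()
local other = b()
print(first, second, other)  -- 1 2 1
```

How much that per-evaluation allocation costs in practice is exactly what the benchmarks in this thread disagree about.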
Thank you, @SkyyySi, for sharing your benchmarking code and insights. It provided a solid starting point for understanding the performance differences between Lua closures and functions. However, after testing your code, I noticed a few issues and wanted to share some refined benchmarking results for clarity.
Observations on Your Benchmark Code
Revised Benchmark Code
To address these issues, I created updated benchmarks that focus more directly on the performance differences. Below are the two refined test cases:
Benchmark 1: Closure vs Function in a Controlled Context
-- benchmark.lua
local function benchmark(name, func, ...)
  io.write(name, " --> ")
  io.flush()
  collectgarbage()
  local time_start = os.clock()
  for i = 1, 100000 do
    func(...)
  end
  collectgarbage()
  local time_finish = os.clock()
  print(string.format(
    "took %.03fs",
    time_finish - time_start
  ))
end
local debug_env_before = function(env) end
local debug_env_after = function(env) end
local env_shared = {}
local func_shared = function(env)
  local result = 1
  for i = 1, 100 do
    result = result * i
  end
  return result
end
-- Using closure
do
  local exe_func
  exe_func = function(func, env)
    return (function(_arg_0, ...)
      local ok = _arg_0
      if ok then
        return ...
      else
        return --os.exit(1)
      end
    end)(xpcall(function()
      debug_env_before(env)
      func(env)
      return debug_env_after(env)
    end, function(ex)
      error(ex)
      return ex
    end))
  end
  benchmark("Using closure", exe_func, func_shared, env_shared)
end
-- Using function
do
  local __exe_func__stub_0 = function(_arg_0, ...)
    local ok = _arg_0
    if ok then
      return ...
    else
      return --os.exit(1)
    end
  end
  local __exe_func__stub_1 = function(func, env)
    debug_env_before(env)
    func(env)
    return debug_env_after(env)
  end
  local __exe_func__stub_2 = function(ex)
    error(ex)
    return ex
  end
  local exe_func
  exe_func = function(func, env)
    return __exe_func__stub_0(xpcall(__exe_func__stub_1, __exe_func__stub_2, func, env))
  end
  benchmark("Using function", exe_func, func_shared, env_shared)
end
-- Results:
-- With Lua 5.4:
-- Using closure --> took 0.118s
-- Using function --> took 0.075s
--
-- With LuaJIT:
-- Using closure --> took 0.040s
-- Using function --> took 0.018s
Benchmark 2: Simplified Closure vs Function Comparison
-- benchmark2.lua
local function benchmark(name, func)
  io.write(name, " --> ")
  io.flush()
  collectgarbage()
  local time_start = os.clock()
  for i = 1, 100000 do
    func()
  end
  collectgarbage()
  local time_finish = os.clock()
  print(string.format(
    "took %.03fs",
    time_finish - time_start
  ))
end
local function using_closure()
  local result = 1
  for i = 1, 100 do
    result = (function()
      return result * i
    end)()
  end
  return result
end
local operation = function(acc, i)
  return acc * i
end
local function using_function()
  local result = 1
  for i = 1, 100 do
    result = operation(result, i)
  end
  return result
end
benchmark("Using closure", using_closure)
benchmark("Using function", using_function)
-- Results:
-- With Lua 5.4:
-- Using closure --> took 1.722s
-- Using function --> took 0.249s
--
-- With LuaJIT:
-- Using closure --> took 0.828s
-- Using function --> took 0.011s
Key Insights
Conclusion
Thank you again for sharing your perspective! I hope this reply clarifies the differences and provides a more precise comparison. While the optimization may not always yield significant gains, it does have merit in specific contexts, particularly for performance-sensitive applications. Let me know if you have further questions or thoughts!
Well... this is my problem, so let me share my view: the performance gain you get from removing the closures is not that significant in most cases. However, the performance loss from closures can be significant depending on your code context, especially in my case.
My Experience:
But in many cases the closures cannot be avoided when YueScript compiles to Lua, so you have to deal with them as well. So let's see what a closure really is in Lua.
About the Statement:
As far as I know:
Accurate Parts:
Better Analogy:
Closure process only instantiates, Lua must:
About Impact Performance:
Example:
COUNT = 1
COUNT_STEP = 100000
benchmark = (name, func, ...) ->
  name = name or "<anonymous>"
  -- io.write(name, " --> ")
  -- io.flush()
  time_start = os.clock()
  for _ = 1, COUNT
    func(...)
  time_finish = os.clock()
  time_process = time_finish - time_start
  print(string.format(
    "%s --> took %.03fs",
    name,
    time_process
  ))
  return time_process
func_has_closure = () ->
  -- Yue automatically creates closures here
  for i = 1, COUNT_STEP
    x, y, z = 1, 2, 3
    local res
    try
      res = x + y + z + i
    assert(res == x + y + z + i)
func_no_closure = () ->
  -- Yue automatically creates an upvalue function here
  for i = 1, COUNT_STEP
    x, y, z = 1, 2, 3
    _, res = try
      x + y + z + i
    assert(res == x + y + z + i)
time1 = benchmark("has_closure", func_has_closure)
time2 = benchmark("no_closure", func_no_closure)
print("VS: #{time1 / time2} time.")
-- Result with Unity3D+xlua
-- LUA: has_closure --> took 0.086s
-- LUA: no_closure --> took 0.016s
-- LUA: VS: 5.375 time.
Conclusion:
Yue has some nice features and syntax that make it a joy to work with, such as the existence operator ?, nil coalescing ??, backcalls, and varargs destructuring. But it also has a big drawback that makes it a bad choice for me when writing performance-sensitive code: it automatically creates a new function every time the enclosing function is called, which is a big performance issue. So I intend to avoid those features as much as possible.
It is a big shame that I cannot use them when they are so nice to have. So I would like to make a proposal to fix this issue.
We can move Yue's auto-generated anonymous functions to upvalues named with the prefix '__', in the same scope as the parent function, so that a new function is not generated every time the parent is called. This gives better performance.
In many cases, we could pass the closure's variables as function parameters and get the modified variables back as function return values.
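As a rough illustration of passing captured variables as parameters and getting the modified value back as a return value, here is a hand-written Lua sketch. It is not actual Yue compiler output, and the names `total_inline`, `total_hoisted`, and `__total_step` are hypothetical:

```lua
-- Sketch only: hand-written Lua, not Yue compiler output. All names are hypothetical.

-- Inline version: a new closure capturing `sum` and `v` is created on every iteration.
local function total_inline(items)
  local sum = 0
  for _, v in ipairs(items) do
    (function()
      sum = sum + v            -- modifies the captured upvalue directly
    end)()
  end
  return sum
end

-- Hoisted version: the helper is created once, in the same scope as its parent.
-- Former upvalues become parameters; the modified value comes back as a result.
local __total_step = function(sum, v)
  return sum + v
end

local function total_hoisted(items)
  local sum = 0
  for _, v in ipairs(items) do
    sum = __total_step(sum, v)
  end
  return sum
end

print(total_inline({ 1, 2, 3 }), total_hoisted({ 1, 2, 3 }))  -- 6 6
```

The rewrite is only straightforward when the set of captured variables is known at the call site and the inner function is invoked immediately; a callback that is stored and later mutates the same variables cannot be handled this way.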
Well... let's take a look at the examples below to see what I mean.
To default Lua:
Maybe to Lua without inner function:
Another more complex example:
To Lua with poor performance:
To Lua with better performance:
I may have made some mistakes in my rush, but I hope you get the idea.
Well... this solution does not work in all cases, but at least it will save a lot of performance in some cases. And I think it is worth a try.
Thanks and regards.