I rewrote some code to use Codea's vec2 userdata type, and it seemed to run much slower as a result. That is not what I had expected. The code below explores that further:
    --
    -- Codea's vec2 userdata
    --
    function setup()
        local n = 100000
        local d
        local v1 = vec2(1, 2)
        local v2 = vec2(4, 5)
        local v1x = v1.x
        local v1y = v1.y
        local v2x = v2.x
        local v2y = v2.y
        local tb1 = {x=1, y=2}
        local tb2 = {x=4, y=5}

        print("Vectors - minus and len")
        t1 = os.clock()
        d = 0
        for i = 1, n do
            local v3 = v2 - v1
            d = d + v3:len()
        end
        dt1 = os.clock() - t1
        print("Result:"..d)
        print(dt1)
        print()

        print("Vectors - partial")
        t2 = os.clock()
        d = 0
        for i = 1, n do
            local v3x = v2.x - v1.x
            local v3y = v2.y - v1.y
            d = d + math.sqrt(v3x*v3x + v3y*v3y)
        end
        dt2 = os.clock() - t2
        print("Result:"..d)
        print(dt2)
        print("Saving (%):", (1 - dt2/dt1)*100)
        print()

        print("Vectors - dist")
        t3 = os.clock()
        d = 0
        for i = 1, n do
            d = d + v2:dist(v1)
        end
        dt3 = os.clock() - t3
        print("Result:"..d)
        print(dt3)
        print("Saving (%):", (1 - dt3/dt1)*100)
        print()

        print("Tables")
        t4 = os.clock()
        d = 0
        for i = 1, n do
            local v3x = tb2.x - tb1.x
            local v3y = tb2.y - tb1.y
            d = d + math.sqrt(v3x*v3x + v3y*v3y)
        end
        dt4 = os.clock() - t4
        print("Result:"..d)
        print(dt4)
        print("Saving (%):", (1 - dt4/dt1)*100)
        print()

        print("Pure number types")
        t5 = os.clock()
        d = 0
        for i = 1, n do
            local v3x = v2x - v1x
            local v3y = v2y - v1y
            d = d + math.sqrt(v3x*v3x + v3y*v3y)
        end
        dt5 = os.clock() - t5
        print("Result:"..d)
        print(dt5)
        print("Saving (%):", (1 - dt5/dt1)*100)
    end

    function draw()
        background(0)
    end
On my iPad2, this gives the following output:
    Vectors - minus and len
    Result:424760
    0.560913

    Vectors - partial
    Result:424760
    0.409302
    Saving (%): 27.0294

    Vectors - dist
    Result:424760
    0.261353
    Saving (%): 53.4059

    Tables
    Result:424760
    0.111877
    Saving (%): 80.0544

    Pure number types
    Result:424760
    0.0709839
    Saving (%): 87.3449
It seems that vec2 comes at a price, and that price is speed.
Comments
Thanks for doing this, I've wondered about the performance implications of using vec2. Since Lua allows multiple return values, I wonder if it would be better to have a set of functions that take and return vectors by their individual components instead. It would probably be a lot less convenient though.
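To make the multiple-return-value idea concrete, here is a minimal sketch in plain Lua. The helper names (vsub, vlen, vdist) are made up for illustration and are not part of Codea. Vectors travel as separate x and y numbers, so no userdata or table is created per operation.

```lua
-- Hypothetical helpers (not part of Codea): vectors are passed and returned
-- as separate x, y numbers, so nothing is allocated per operation.
local sqrt = math.sqrt

local function vsub(ax, ay, bx, by)
    return ax - bx, ay - by
end

local function vlen(x, y)
    return sqrt(x * x + y * y)
end

local function vdist(ax, ay, bx, by)
    -- vsub is the last argument, so both of its return values are passed on
    return vlen(vsub(ax, ay, bx, by))
end

-- Usage: distance between (4, 5) and (1, 2)
local d = vdist(4, 5, 1, 2)
print(d)
```

As the comment says, this is less convenient: every call site has to carry two numbers per vector.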
That's really interesting. I want to run some tests too, because I was pretty sure (and, wrongly, never checked) that vec2 math would be faster than Lua math on generic numbers, tables, etc., partly because of some discussions on this forum. Something that could probably make these calculations faster with vec2 would be the ability to avoid creating a new vec2 object each time (which I fear is the real cause of the performance problem), for example methods that apply transformations (rotate, translate, etc.) directly to the same vec2 or to a vec2 passed as a parameter. @Simeon what do you think about @mpilgrem's results?
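A rough sketch of the "write into an existing vector" idea, using plain Lua tables as a stand-in, since Codea's vec2 does not offer such methods as far as this thread shows. The function name subInto is hypothetical.

```lua
-- Stand-in vectors: plain tables with x and y fields (not Codea's vec2).
-- subInto is a hypothetical helper that writes a - b into an existing table.
local function subInto(out, a, b)
    out.x = a.x - b.x
    out.y = a.y - b.y
    return out
end

local a = {x = 4, y = 5}
local b = {x = 1, y = 2}
local tmp = {x = 0, y = 0}   -- allocated once, outside the loop

local d = 0
for i = 1, 1000 do
    subInto(tmp, a, b)       -- reuses tmp; no new object per iteration
    d = d + math.sqrt(tmp.x * tmp.x + tmp.y * tmp.y)
end
print(d)
```

Whether this would help with the real userdata type depends on how much of the cost is allocation and how much is the call into C, which this sketch does not measure.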
Those are very interesting results. I suspect there may be a lot of overhead when constructing a new userdata type, as well as calling out to C. So for the types of simple calculations you're performing, the overhead outweighs the benefits.
This is the source code for our vec2 implementation (from the Codea Runtime Library): https://github.com/TwoLivesLeft/Codea-Runtime/blob/master/CodeaTemplate/LuaLibs/vec2.c
Perhaps performance would be better if we re-wrote this as a pure-Lua library?
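For discussion, a minimal sketch of what a pure-Lua vector could look like, built on a metatable. It is named Vec2 here so it does not clash with the built-in vec2, and it covers only the operations used in the benchmark; it is not a port of vec2.c, and whether it would actually be faster is untested.

```lua
-- Illustrative pure-Lua vector type (not the Codea implementation).
local Vec2 = {}
Vec2.__index = Vec2

local function new(x, y)
    return setmetatable({x = x, y = y}, Vec2)
end

-- v2 - v1 allocates a new table for the result
Vec2.__sub = function(a, b)
    return new(a.x - b.x, a.y - b.y)
end

function Vec2:len()
    return math.sqrt(self.x * self.x + self.y * self.y)
end

-- dist avoids the temporary vector that (a - b):len() would create
function Vec2:dist(other)
    local dx, dy = self.x - other.x, self.y - other.y
    return math.sqrt(dx * dx + dy * dy)
end

local v1, v2 = new(1, 2), new(4, 5)
print((v2 - v1):len(), v2:dist(v1))
```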
You can vastly improve the performance of the vec2 benchmarks by locally caching the functions. The biggest slowdown is lookups on the vec2 members. This gives me a saving of ~76% on that particular test.
Caching the function is not ideal, though. But for tight loops, this might be necessary for good performance.
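A minimal sketch of what caching a method in a local could look like for the dist test. Inside Codea this uses the built-in vec2; the fallback definition is only an assumption so that the snippet also runs in plain Lua, and it makes no claim about the timings quoted above.

```lua
-- Use Codea's vec2 when available; otherwise fall back to a small pure-Lua
-- stand-in (an assumption, only so the sketch runs outside Codea).
local vec2 = vec2 or (function()
    local mt = {}
    mt.__index = mt
    function mt.dist(a, b)
        local dx, dy = a.x - b.x, a.y - b.y
        return math.sqrt(dx * dx + dy * dy)
    end
    return function(x, y)
        return setmetatable({x = x, y = y}, mt)
    end
end)()

local v1, v2 = vec2(1, 2), vec2(4, 5)
local n = 100000

-- Look the method up once, outside the loop.
local dist = v2.dist

local d = 0
for i = 1, n do
    -- Same as v2:dist(v1), minus the member lookup on every iteration
    d = d + dist(v2, v1)
end
print(d)
```

The same trick applies to library functions, for example `local sqrt = math.sqrt` before a loop that calls it many times.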
This appears to be due to Lua's luaL_checkudata call, which validates the vec2 type for safety before attempting to perform the operation. It's quite a slow call, so I'm going to try to find a way to work around this while maintaining a safety check.
Edit: I am able to speed up the built-in vectors so that they are faster than the "Pure number" and table solutions, however Codea could potentially be crashed by passing in an incorrect type (for example, passing a vec2 into a vec4 length function). Unsure whether it would be worth sacrificing stability for speed.
@Simeon maybe add a global setting function vec2check(boolean), set to true by default? It would be true while writing and debugging, and set to false when the game is ready?
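Just to illustrate the shape of that suggestion, here is a sketch in plain Lua. vec2check does not exist in Codea; the name comes from the comment above, and the Vec2 table is a stand-in for the real userdata. In pure Lua a skipped check leads to wrong results or an ordinary error rather than a crash, so this only shows the toggle, not the stability risk.

```lua
-- Hypothetical sketch: a type check that can be switched off at runtime.
local Vec2 = {}
Vec2.__index = Vec2

local checksEnabled = true   -- checks are on by default

-- Hypothetical toggle, named after the suggestion in the thread
local function vec2check(enabled)
    checksEnabled = enabled
end

local function new(x, y)
    return setmetatable({x = x, y = y}, Vec2)
end

function Vec2.len(v)
    -- The validation only runs while checks are enabled
    if checksEnabled and getmetatable(v) ~= Vec2 then
        error("vec2 expected", 2)
    end
    return math.sqrt(v.x * v.x + v.y * v.y)
end

print(Vec2.len(new(3, 4)))                -- a real Vec2 always works
print(pcall(Vec2.len, {x = 3, y = 4}))    -- rejected while checks are on
vec2check(false)
print(Vec2.len({x = 3, y = 4}))           -- accepted once checks are off
```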
I have an experimental fix that is still safe to use.
It's still always going to be faster to cache the method calls in locals before entering a tight loop, though. That's just the way Lua is.