Simeon is looking at improving vector performance in Codea, but it is a complex problem.
If you know enough about C to help with the challenge below, please message Simeon.
He says …
I spent some time looking at solutions for removing the need for allocations on vector objects.
vec2 (and friends) are Lua “user data” objects, which are basically Lua-managed interfaces to C code and data. The problem is that every operation on them must return a result, which means asking Lua to create a new user data, and that causes Lua to allocate memory with its memory allocator. That allocation is the source of the performance difference.
Out of curiosity, I wrote a pure-Lua vec2 implementation and tested it on your benchmark. It was extremely slow (1000 iterations/frame vs. 8000 for user data).
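For reference, a pure-Lua vec2 along these lines (a minimal sketch, not Simeon's actual code) has the same fundamental problem: every arithmetic operation builds a brand-new table, so the garbage collector is hit once per operation, just as the user-data version hits the allocator.

```lua
-- Minimal table-based vec2 sketch (hypothetical; not Codea's implementation).
local vec2 = {}
vec2.__index = vec2

local function new(x, y)
    -- One fresh table per result: this is the allocation cost in question
    return setmetatable({x = x, y = y}, vec2)
end

function vec2.__add(a, b) return new(a.x + b.x, a.y + b.y) end
function vec2.__sub(a, b) return new(a.x - b.x, a.y - b.y) end
function vec2.__mul(a, s) return new(a.x * s, a.y * s) end

function vec2:len()
    return math.sqrt(self.x * self.x + self.y * self.y)
end

-- Two constructor calls plus one for the __add result: three tables total
local p = new(1, 2) + new(3, 4)
```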
There are only a few solutions I can think of, and some of them seem a bit infeasible.
- Modify the Lua interpreter to support native vector types. This would be pretty tricky and make keeping up-to-date with future versions of Lua quite hard.
- Find a faster memory allocator. I tried using LuaJIT’s memory allocator with Lua and it seems a little faster, though not much. I’m pretty sure the generalised malloc/realloc/free are as fast as they get.
- Use a custom memory allocator that draws from a static pool rather than the heap. This would require some fixed memory size from which vectors can be drawn. The problem is you’d probably want to make this explicit in the code, or you’ll run out of pool memory in general purpose applications.
-- Do expensive vector loop; only 20 vectors can be allocated before old ones are reused
This is hard to implement in a bug-free manner though.
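To make the hazard concrete, here is a rough ring-buffer pool sketched in plain Lua (a hypothetical illustration, not a proposed API): at most a fixed number of results are live at once, and the oldest result is silently overwritten, which is exactly why this is hard to get bug-free.

```lua
-- Hypothetical ring-buffer pool: results are drawn from a fixed set of
-- slots, and allocation wraps around to reuse the oldest slot.
local POOL_SIZE = 20
local pool, cursor = {}, 0
for i = 1, POOL_SIZE do pool[i] = {x = 0, y = 0} end

local function alloc(x, y)
    cursor = cursor % POOL_SIZE + 1
    local v = pool[cursor]
    v.x, v.y = x, y
    return v
end

local function add(a, b) return alloc(a.x + b.x, a.y + b.y) end

-- Fine inside a tight loop...
local sum = add({x = 1, y = 2}, {x = 3, y = 4})

-- ...but any result held across more than POOL_SIZE allocations gets
-- overwritten behind your back, with no error to tell you it happened.
```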
- Use a vector-expression API where you pre-construct the expression tree, then evaluate the expression with vectors as arguments, e.g.:
local expr = vec2("someVec + someOther * (radius * 2.6) - pos")

-- In the loop
local v = expr:evaluate(a, b, c, d, e)
This is a bit clunky, though. And it still allocates at least one new vector (v).
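One way the expression idea could be prototyped in plain Lua (a hypothetical sketch using `load` to compile the string once, Lua 5.2+; shown with plain numbers rather than vectors to keep it short) is:

```lua
-- Hypothetical expression API: compile the source string into a Lua chunk
-- once, outside the loop, then evaluate it repeatedly with fresh arguments.
-- A real version would compile to vector ops and write into a
-- caller-supplied result to avoid allocating `v` each time.
local function expression(src, ...)
    local params = table.concat({...}, ", ")
    local chunk = assert(load("local " .. params .. " = ...; return " .. src))
    return { evaluate = function(self, ...) return chunk(...) end }
end

-- Built once, outside the loop
local expr = expression("a + b * (radius * 2.6) - pos",
                        "a", "b", "radius", "pos")

-- In the loop
local v = expr:evaluate(1, 2, 10, 3)  -- 1 + 2 * (10 * 2.6) - 3 = 50
```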