I recently discovered that a simple three-line function, with one normalize and one dot call, could be sped up more than 4x by getting rid of the vectors.
What I mean by that is breaking vectors up into separate x, y, z (scalar) values, doing the calculations on each separately, and avoiding expensive operations like square roots.
So this:
function IsVisible(pos,radius) -- pos is a vec2
    local p = pos + cameraDirection * (radius * tangentAdjust)
    local v = (p - cameraPos):normalize() -- normalize() costs a square root
    return v:dot(cameraDirection) > cosFOV
end
can be made over 4x faster with this:
function IsVisible(pos,radius)
    local px,py = pos.x,pos.y
    local dx,dy = px-camposX, py-camposY
    -- early out: the object is behind the camera
    if dx*camdirX + dy*camdirY < 0 then return false end
    local u = radius*tangentAdjust
    local ptx,pty = px+camdirX*u-camposX, py+camdirY*u-camposY
    local sq = ptx*ptx + pty*pty -- squared length, no sqrt needed
    local a = ptx*camdirX + pty*camdirY
    -- cosFOV2 is cosFOV*cosFOV, precomputed; comparing squares avoids normalize()
    return a*a > cosFOV2*sq
end
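To see why the scalar version can drop the square root entirely: for any vector v and unit direction d with v·d ≥ 0, the test normalize(v)·d > cosFOV is equivalent to (v·d)² > cos²FOV·|v|². Here is a minimal Python sketch of the same math (the names cam_pos, cam_dir, and tangent_adjust mirror the Lua globals above and are otherwise illustrative):

```python
import math

def is_visible_slow(pos, radius, cam_pos, cam_dir, cos_fov, tangent_adjust):
    # Normalize-then-dot version: one sqrt plus divides per call.
    u = radius * tangent_adjust
    px = pos[0] + cam_dir[0] * u - cam_pos[0]
    py = pos[1] + cam_dir[1] * u - cam_pos[1]
    length = math.sqrt(px * px + py * py)
    return (px / length) * cam_dir[0] + (py / length) * cam_dir[1] > cos_fov

def is_visible_fast(pos, radius, cam_pos, cam_dir, cos_fov, tangent_adjust):
    dx, dy = pos[0] - cam_pos[0], pos[1] - cam_pos[1]
    if dx * cam_dir[0] + dy * cam_dir[1] < 0:
        return False  # behind the camera
    u = radius * tangent_adjust
    px = pos[0] + cam_dir[0] * u - cam_pos[0]
    py = pos[1] + cam_dir[1] * u - cam_pos[1]
    a = px * cam_dir[0] + py * cam_dir[1]
    # a/|p| > cosFOV  <=>  a*a > cosFOV^2 * |p|^2   (valid because a >= 0 here)
    return a * a > cos_fov * cos_fov * (px * px + py * py)
```

Both functions agree away from the behind-the-camera edge case, but the fast one never calls sqrt.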
Simeon explains it like this:
The performance difference appears to be due to allocations: every vector mult/sub/add has to allocate a new vector object as Lua userdata to return its result. The overhead of the allocations accounts for all the difference in performance.

I'm going to look into whether we can come up with an alternate memory allocator for lots of small, short-lived objects. Note that this problem exhibits itself because the vectors are short-lived and created/deleted constantly. Using vectors in a more long-term scenario should be totally fine (the overhead will not really be noticeable without lots of operations).
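Simeon's allocation point can be illustrated in any language with object-style vectors. In this rough Python sketch (Vec2 is a hypothetical stand-in for Codea's userdata vec2, not its actual implementation), the object version creates three temporary vectors per call while the scalar version allocates nothing:

```python
class Vec2:
    """Hypothetical immutable 2D vector; every operation allocates a new object."""
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, o): return Vec2(self.x + o.x, self.y + o.y)
    def __sub__(self, o): return Vec2(self.x - o.x, self.y - o.y)
    def __mul__(self, s): return Vec2(self.x * s, self.y * s)
    def dot(self, o):     return self.x * o.x + self.y * o.y

def with_objects(a, b, s):
    # Allocates three temporaries: (b*s), (a + b*s), and the subtraction result.
    return (a + b * s - a).dot(b)

def with_scalars(ax, ay, bx, by, s):
    # Same arithmetic, zero object allocations.
    tx, ty = ax + bx * s - ax, ay + by * s - ay
    return tx * bx + ty * by
```

When such expressions run thousands of times per frame, the temporaries dominate the cost, which matches Simeon's explanation above.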