Alternative to Tables?

Is there a better way to store objects and their positions than using a table? In my game, I am using tables to store the particles that are on screen (which can get above 1000), and the game slows down significantly when there are more than two balls on screen with particles surrounding them. I would like to continue supporting the iPad 2 and 3 and the iPhone 4 and 4s, but the game is just too slow on these devices.

@YoloSwag Tables aren’t the problem. I have a table that contains 300,250 words, or 2,851,000 letters. I can search that whole table from start to finish for a specific combination of letters in the blink of an eye. Tables are very fast, so I would say displaying all the particles is what’s slowing things down.

@YoloSwag - agreed, tables are not the problem

If you aren’t doing so already, consider using meshes instead of images

You can maybe also fake the particles using noise or something similar. If I know what you’re doing, I can maybe suggest something more specific.

If you are using ellipses for the particles, draw an ellipse onto an image in setup and sprite that image instead of doing ellipse() for every particle. I did a simple test where I drew 10,000 per frame of a particular image, and a separate test where I drew 10,000 per frame of an ellipse of the same size. The images took on average .291 seconds to draw (as in, that was the average DeltaTime) and the ellipses took a massive average of 2.5 seconds to draw. So, that’s something to keep in mind
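
Here is a minimal sketch of the pre-rendered ellipse idea, assuming Codea's image(), setContext() and sprite() calls. The names SIZE, particleImg and particles are made up for illustration and are not from the original project.

```lua
-- Sketch only: runs inside Codea, where image, setContext, ellipse,
-- sprite, fill, noStroke, background, WIDTH and HEIGHT come from the runtime.
local SIZE = 20          -- hypothetical particle diameter
local particleImg        -- cached image holding one ellipse
local particles = {}     -- hypothetical list of {x = ..., y = ...}

function setup()
    -- Draw the ellipse once into an off-screen image.
    particleImg = image(SIZE, SIZE)
    setContext(particleImg)
    fill(255)
    noStroke()
    ellipse(SIZE / 2, SIZE / 2, SIZE)
    setContext()         -- back to drawing on the screen
    for i = 1, 1000 do
        particles[i] = {x = math.random() * WIDTH, y = math.random() * HEIGHT}
    end
end

function draw()
    background(0)
    local img, list = particleImg, particles   -- local references
    for i = 1, #list do
        local p = list[i]
        sprite(img, p.x, p.y)                  -- no ellipse() call per particle
    end
end
```

If the frame rate does not improve with something like this, it is worth checking that ellipse() is not still being called somewhere inside draw().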

@Monkeyman32123 That works pretty well. I sprited an ellipse with a size of 20 and I was displaying 10000 of them at 30 FPS. I switched to just drawing 10000 ellipses and the FPS dropped to 7.

@Monkeyman32123 The 10000 sprited ellipses were running at a DeltaTime of about .032. The 10000 ellipses were running at about .142.

If you use a noise texture to fake the particles, you could really speed things up

Thanks guys, I’ll take another look at how I coded my tables because I thought I was using sprited ellipses, but maybe I did it incorrectly.

The other thing to consider is using a shader to render the particles. One option is to pass each particle as a single vertex, so that each triangle in your mesh actually carries 3 particles.

Alternatively (and this is just a hypothetical guess), you could store all the particles in a buffer (see the mesh / buffer docs) and then render a full screen quad that then renders the particles using a fragment shader.

As for tables - they are Lua’s only data structure, so the language is geared to work with them as efficiently as possible.

It might be worth profiling your code (use ElapsedTime or os.clock, since os.time only has one-second resolution) to see how long each bit of code actually takes, then focus on the slow parts - simple things like not declaring locals inside loops, avoiding repeated dynamic memory allocation, or locally caching references to functions can make a BIG difference in execution speed.
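
As a rough sketch of that kind of profiling, the helper below times a function over many calls with os.clock, which reports CPU time with sub-second resolution. The names timeIt and updateParticles are hypothetical stand-ins, not anything from the original project; inside Codea, ElapsedTime could be substituted for os.clock.

```lua
-- Times fn(...) over `reps` calls and returns the elapsed CPU seconds.
local function timeIt(label, reps, fn, ...)
    local clock = os.clock
    local start = clock()
    for _ = 1, reps do
        fn(...)
    end
    local elapsed = clock() - start
    print(string.format("%s: %.4f s for %d calls", label, elapsed, reps))
    return elapsed
end

-- Hypothetical stand-in for one piece of per-frame work.
local function updateParticles(list)
    for i = 1, #list do
        local p = list[i]
        p.x, p.y = p.x + p.vx, p.y + p.vy
    end
end

local particles = {}
for i = 1, 1000 do
    particles[i] = {x = 0, y = 0, vx = 1, vy = 2}
end

local elapsed = timeIt("updateParticles", 100, updateParticles, particles)
```

Timing each stage separately (update, sort, draw) in this way should show which one dominates before anything gets rewritten.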

Could anyone please show ( or link to a sample ) how to draw hundreds of images using meshes without box2d?
Unfortunately I’m seeing this very same issue in my code .
Images run just fine up to 70/80 moving items on the screen. At that point fps ruthlessly drops down to 45/50. It takes just a few frames to get back to 60, though.
I usually create one single mesh, adding one rectangle and setting my objects’ position by looping into a table where I put my (80 or so) objects in.

My workaround so far : capping items respawn amount, but very, very disappointing
:-((

thank you
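
For what it's worth, here is a minimal sketch of drawing a few hundred moving images with one mesh and no physics, assuming Codea's mesh(), addRect() and setRect() calls. COUNT, SIZE, items and the generated texture are placeholders for illustration only.

```lua
-- Sketch only: runs inside Codea, where mesh, image, setContext, ellipse,
-- fill, background, WIDTH, HEIGHT and DeltaTime come from the runtime.
local COUNT, SIZE = 300, 32      -- hypothetical values
local m                          -- one mesh holds every item
local items = {}                 -- items[i] = {x, y, vx, vy, rect}

function setup()
    -- Stand-in texture; a real project would use its own image here.
    local img = image(SIZE, SIZE)
    setContext(img)
    fill(255, 200, 0)
    ellipse(SIZE / 2, SIZE / 2, SIZE)
    setContext()

    m = mesh()
    m.texture = img
    for i = 1, COUNT do
        local x, y = math.random() * WIDTH, math.random() * HEIGHT
        items[i] = {
            x = x, y = y,
            vx = (math.random() - 0.5) * 200,
            vy = (math.random() - 0.5) * 200,
            rect = m:addRect(x, y, SIZE, SIZE)   -- added once, index kept
        }
    end
end

function draw()
    background(40, 40, 50)
    local dt, msh, list = DeltaTime, m, items
    for i = 1, #list do
        local it = list[i]
        it.x = (it.x + it.vx * dt) % WIDTH
        it.y = (it.y + it.vy * dt) % HEIGHT
        msh:setRect(it.rect, it.x, it.y, SIZE, SIZE)  -- move, don't re-add
    end
    msh:draw()                                        -- one draw call for all
end
```

The point of the design is that rectangles are added once in setup() and only repositioned with setRect() each frame, so the mesh is not rebuilt and everything goes out in a single draw call.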

@Dave1707, I’m glad you tried that as well. I think the great difference in our numbers is because I was drawing 256x256 images and diameter 256 ellipses, and the ellipse shader must slow down a LOT the bigger the ellipse is; so I think it’s safe to say that no matter what the size the fastest option is to sprite the ellipse, but especially so with larger ellipses.

Also for other speed tips, use a mesh and a very thin addRect for lines [so long as you don’t mind them all being flat ended]
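
A small sketch of the thin-rectangle line trick: the segment's midpoint, length and angle become the rectangle's centre, width and rotation. lineRect and addLine are hypothetical helper names, and the sketch assumes that addRect's optional fifth argument is a rotation in radians.

```lua
-- math.atan accepts (y, x) in Lua 5.3+; older versions provide math.atan2.
local atan2 = math.atan2 or math.atan

-- Returns centre x, centre y, length and angle (radians) for a segment.
local function lineRect(x1, y1, x2, y2)
    local dx, dy = x2 - x1, y2 - y1
    return (x1 + x2) / 2, (y1 + y2) / 2, math.sqrt(dx * dx + dy * dy), atan2(dy, dx)
end

-- Adds one flat-ended line of the given thickness to an existing mesh.
-- `m` is expected to be a Codea mesh (assumption: addRect(x, y, w, h, r)).
local function addLine(m, x1, y1, x2, y2, thickness)
    local cx, cy, len, angle = lineRect(x1, y1, x2, y2)
    return m:addRect(cx, cy, len, thickness, angle)
end
```

Calling addLine for every segment and then drawing the mesh once batches all the lines into a single draw call.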

If you are going to draw a lot of the same image in one function (or use a lot of the same drawing function), save it to a local variable within that function instead of going for global or self access, as locals are much faster to access than other types. This also applies to any other function you may be calling a lot in a function. So rather than calling math.cos() a million times, do ‘local cos = math.cos’ and use the local cos() function.

For instance, the code

for i = 1, 10000 do
  local x = math.sin(i)
end

runs 30% slower than this one:

local sin = math.sin
for i = 1, 10000 do
  local x = sin(i)
end

Same with the ellipse, rect, line, etc. functions. And do the same with images. So instead of spriting global ‘img’ 10000 times do ‘local locimg = img’ and ‘sprite(locimg,x,y)’.

For loops are FAR faster than pairs and ipairs; pairs takes on average 2.2x longer than a numerical for loop, and ipairs takes considerably longer than pairs; so avoid pairs and ipairs but if one MUST be used, try to use pairs.
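
These numbers depend a lot on the Lua version and the device, so it is worth measuring rather than taking them on faith. Below is a rough benchmark sketch in plain Lua; the function names and the values of N and the repetition count are arbitrary choices for illustration.

```lua
local N = 100000
local t = {}
for i = 1, N do t[i] = i end

local function sumNumericFor(list)
    local s = 0
    for i = 1, #list do s = s + list[i] end
    return s
end

local function sumIpairs(list)
    local s = 0
    for _, v in ipairs(list) do s = s + v end
    return s
end

local function sumPairs(list)
    local s = 0
    for _, v in pairs(list) do s = s + v end
    return s
end

-- Runs fn over the shared table `reps` times and prints the CPU time used.
local function bench(label, fn, reps)
    local start = os.clock()
    local result
    for _ = 1, reps do result = fn(t) end
    print(string.format("%-12s %.3f s", label, os.clock() - start))
    return result
end

local a = bench("numeric for", sumNumericFor, 20)
local b = bench("ipairs", sumIpairs, 20)
local c = bench("pairs", sumPairs, 20)
```

All three loops compute the same sum, so any difference in the printed times comes from the iteration method alone.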

Tables become very slow when you have many small ones. For instance, a table with three elements requires three rehashings, whereas a table with 1,000,000 elements only requires twenty. So when you can use one large table instead of many small ones, it significantly boosts speed. Additionally, it is much faster to redeclare indices in a table than to set new ones, and when initially creating a table it is much faster to declare things there than to later declare indices. So, for example, say you need a table with only four items, it is over twice as fast to do

a = {true,true,true,true} --or a = {x=1,y=2,z=2,w=2}
a[1],a[2],a[3],a[4] = 5,6,7,8 -- ^ and then a.x,a.y,a.z,a.w = 5,6,7,8

than to do

a = {} 
a[1],a[2],a[3],a[4] = 5,6,7,8 

This trick, oddly enough, does not work if the first line is like this:

a = {[1]=true,[2]=true,[3]=true,[4]=true}

While on the subject of tables, the speed difference between using ‘a = {x=1,y=2}’ and ‘a = {1,2}’ should be noted.

A 1,000,000 point list like this:

points= {{x=3,y=5} , {x=3,y=3} , {x=5,y=3} ,...}

Takes 95 MB of memory, whereas one like this:

points= {{3,5} , {3,3} , {5,3} ,...}

Takes 65 MB; and one like this:

points= {x={3,3,5,...},y={5,3,3,...}}

Takes only 24 MB of memory! Wow!
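
The three layouts can be compared on your own device with collectgarbage("count"), which reports the kilobytes currently in use. This is only a rough sketch: measure is a hypothetical helper, N is much smaller than 1,000,000 to keep the run short, and the exact figures will differ between Lua versions.

```lua
local N = 20000

-- Returns the approximate memory (in KB) retained by whatever build() returns.
local function measure(build)
    collectgarbage("collect")
    local before = collectgarbage("count")
    local data = build()          -- kept alive until after the second reading
    collectgarbage("collect")
    local used = collectgarbage("count") - before
    return used, data
end

-- Layout 1: one small table with named fields per point.
local namedKB = measure(function()
    local points = {}
    for i = 1, N do points[i] = {x = i, y = i} end
    return points
end)

-- Layout 2: one small array per point.
local pairsKB = measure(function()
    local points = {}
    for i = 1, N do points[i] = {i, i} end
    return points
end)

-- Layout 3: two large arrays, one per coordinate.
local arraysKB = measure(function()
    local points = {x = {}, y = {}}
    for i = 1, N do points.x[i], points.y[i] = i, i end
    return points
end)

print(string.format("named: %.0f KB, pairs: %.0f KB, arrays: %.0f KB",
                    namedKB, pairsKB, arraysKB))
```

The saving in the third layout comes from having two big tables instead of N small ones, each of which carries its own fixed overhead.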

Never declare a variable inside a for loop if it can be avoided
So instead of: ‘for i=0,1000 do local t={1,2,3,4}; … end’ do ‘local t={1,2,3,4}; for i=0,1000 do … end’
And if you do intend to create a table inside a loop at each iteration and the contents depend on the iteration number, remember from earlier that it is faster to redeclare each of the indices than to redeclare the whole table, so make the table outside of the loop but change the contents at each iteration.
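
A minimal sketch of the two patterns side by side; sumFresh and sumReused are hypothetical names and the arithmetic is just filler so the loops have something to do.

```lua
-- Allocates a fresh table on every iteration (more work for the garbage collector).
local function sumFresh(n)
    local total = 0
    for i = 1, n do
        local t = {i, i * 2, i * 3, i * 4}
        total = total + t[1] + t[2] + t[3] + t[4]
    end
    return total
end

-- Creates the table once and overwrites its four slots on each iteration.
local function sumReused(n)
    local total = 0
    local t = {0, 0, 0, 0}
    for i = 1, n do
        t[1], t[2], t[3], t[4] = i, i * 2, i * 3, i * 4
        total = total + t[1] + t[2] + t[3] + t[4]
    end
    return total
end

print(sumFresh(1000), sumReused(1000))
```

Reusing the table is only safe when nothing holds on to it between iterations. If each iteration stores the table somewhere, a fresh one is still needed.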

Whelp, that’s all folks! Hope these tidbits help!

@Monkeyman32123 - very useful analysis, thank you

Agreed that’s some really useful advice.
Especially with regard to the caching of local functions, this can also work at the module level as opposed to the function level.
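
For example, something along these lines at the top of a file (a sketch only; rotate is a made-up helper):

```lua
-- Cached once when the file loads; every function below shares these upvalues.
local sin, cos = math.sin, math.cos

-- Hypothetical helper: rotates the point (x, y) by `angle` radians.
local function rotate(x, y, angle)
    local c, s = cos(angle), sin(angle)
    return x * c - y * s, x * s + y * c
end

print(rotate(1, 0, 0))
```

Functions in the file then reach sin and cos as upvalues rather than looking up the global math table and indexing it on every call.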

It always makes me smile when I see code posted using pairs or ipairs, I’ve always used the “old fashioned” for loop with an index, it avoids all the overhead of calling iterator functions, dealing with return values etc etc - the old fashioned method is more closely aligned with the way loops are executed on a processor and is by far the most efficient way.

For any noobs to the game programming game - the most important thing you can remember is the KISS principle - KEEP IT SIMPLE STUPID! :)

@Ignatz you’re welcome, friend!
@Techdojo, I’m rather new to coding but I have made it a habit to try for the most efficient way to do things, and I try to weed out bad coding habits before they develop, and I must say that using ipairs and pairs where a plain for loop would do seems to be one of the most prevalent issues I’ve seen in my studies of efficient coding (speaking of, much thanks to @Ignatz for the link to that book on efficient programming).

Also, for speed, if you like rectangles, here’s the rectangle replacement function I use. It’s quite fast (faster than the normal function in all of my tests, but that doesn’t mean it’s perfect).

EDIT: forgot the rectMode RADIUS

rectangles = {origrect = rect,rectmode = rectMode,msh = mesh(),
[CORNERS] = function(x,y,x2,y2)
    local m = rectangles.msh
    local diffx,diffy,avx,avy = x2-x,y2-y
    avx,avy = x+(diffx)/2,y+(diffy)/2
    m:addRect(avx,avy,diffx,diffy)
    local str = strokeWidth()*2
    if str > 0 then
        m:setRectColor(1,stroke())
        m:addRect(avx,avy,diffx-str,diffy-str)
        m:setRectColor(2,fill())
    else
        m:setRectColor(1,fill())
    end
    m:draw()
    m:clear()
end,
[CORNER] = function(x,y,x2,y2)
    local m = rectangles.msh
    local diffx,diffy,avx,avy = x2,y2
    avx,avy = x+(diffx)/2,y+(diffy)/2
    m:addRect(avx,avy,diffx,diffy)
    local str = strokeWidth()*2
    if str > 0 then
        m:setRectColor(1,stroke())
        m:addRect(avx,avy,diffx-str,diffy-str)
        m:setRectColor(2,fill())
    else
        m:setRectColor(1,fill())
    end
    m:draw()
    m:clear()
end,
[CENTER] = function(x,y,x2,y2)
    local m = rectangles.msh
    local diffx,diffy,avx,avy = x2,y2,x,y
    m:addRect(avx,avy,diffx,diffy)
    local str = strokeWidth()*2
    if str > 0 then
        m:setRectColor(1,stroke())
        m:addRect(avx,avy,diffx-str,diffy-str)
        m:setRectColor(2,fill())
    else
        m:setRectColor(1,fill())
    end
    m:draw()
    m:clear()
end,
[RADIUS] = function(x,y,x2,y2)
    local m = rectangles.msh
    local diffx,diffy,avx,avy = x2*2,y2*2,x,y
    m:addRect(avx,avy,diffx,diffy)
    local str = strokeWidth()*2
    if str > 0 then
        m:setRectColor(1,stroke())
        m:addRect(avx,avy,diffx-str,diffy-str)
        m:setRectColor(2,fill())
    else
        m:setRectColor(1,fill())
    end
    m:draw()
    m:clear()
end}
function rectMode(mode)
    rect = rectangles[mode]
    --below line can be commented out if you have no desire to use the original rect function
    rectangles.rectmode(mode)
end

EDIT: And the main to show how it is used:

function setup()
    rectMode(RADIUS)
    fps = 60
end
function draw() 
    background(40, 40, 50)
    strokeWidth(5)
    stroke(255, 0, 0, 255)
    fill(0, 0, 255, 255)
    for i=0,100 do
        --BELOW: rect and originalrect calls set up to be the same in each of the rectModes noted to the right of the call
        --my version is rect() and the original rect function is now rectangles.origrect()
        --comment and uncomment whichever one you want to speed test
        
        --rect(WIDTH/2-50,HEIGHT/2-50,WIDTH/2+50,HEIGHT/2+50) --CORNERS
        --rect(WIDTH/2,HEIGHT/2,100,100) --CENTER
        --rect(WIDTH/2-50,HEIGHT/2-50,100,100) --CORNER
        rect(WIDTH/2,HEIGHT/2,50,50) --RADIUS
        --rectangles.origrect(WIDTH/2-50,HEIGHT/2-50,WIDTH/2+50,HEIGHT/2+50) --CORNERS
        --rectangles.origrect(WIDTH/2,HEIGHT/2,100,100) --CENTER
        --rectangles.origrect(WIDTH/2-50,HEIGHT/2-50,100,100) --CORNER
        --rectangles.origrect(WIDTH/2,HEIGHT/2,50,50) --RADIUS
    end
    output.clear()
    fps = .9*fps+.1/DeltaTime
    print("FPS: "..math.floor(fps))
end

This is part of a series of things I’m making that take regular codea functions and improve their speed. Tell me if you like it. I’m thinking about making an additional speed boost by replacing the if statement that checks for stroke with a replacement function for stroke and noStroke that change the entire rect function to stroke and noStroke versions respectively. This won’t add a tremendous amount of speed, but I think it will be enough to make a difference; tell me if you agree and I will get to doing it.

@Monkeyman32123 - sounds like a cool series, I’ll be interested in following it.
Don’t forget even a small increase can make a massive difference if you call it 1,000,000 times.

I think where drawing is concerned, the fastest way to draw is still to use meshes to batch all your sprite calls and, where possible, stick to a single texture page (atlas).

@Monkeyman32123 I never thought that writing a function in Codea to replace the actual Codea function would be faster. I tried your rectangle code and the results I found confirm my beliefs: it’s not faster. I increased your “for” loop from 100 to 1000 and your “rect” call ran at 25 FPS. I ran the “rectangles.origrect” call and that ran at 30 FPS. I’m sure the built-in functions have been optimized many times over and the only way to increase the speed of a Codea program is to optimize the code being written, not the functions themselves.

Odd, my tests got completely different results, perhaps I’ve overlooked something, so I would like to see the code you used to test it if you don’t mind. I’ve been known to overlook simple things in the past, and I trust your test more than my own

@Monkeyman32123 I used your code with minor changes to show FPS. I changed the limit of the “for” loop from 100 to 1000. I changed how FPS is calculated so that it shows a true average over the time the code is running by adding “total” and “count” and doing the FPS calculation of “count/total”. All of your other code remains the same. There is no way you can write a function in Codea that will be faster than the built-in function unless the built-in function is badly written, which I doubt will happen. You might have to adjust the “for” loop limit based on the device you’re using. I’m running on an iPad Air. Below are the 2 functions that I modified. I also removed the commented code that I didn’t use.


function setup()
    rectMode(RADIUS)
    total=0
    count=0
end

function draw() 
    background(40, 40, 50)
    strokeWidth(5)
    stroke(255, 0, 0, 255)
    fill(0, 0, 255, 255)
    for i=0,1000 do
        rect(WIDTH/2,HEIGHT/2,50,50) --RADIUS
        --rectangles.origrect(WIDTH/2,HEIGHT/2,50,50) --RADIUS
    end
    
    fill(255)
    count=count+1
    total=total+DeltaTime
    text(math.floor(count/total),WIDTH/2,HEIGHT-100)
end

If you are going to claim something is faster, please use a more accurate measurement of fps than the fps = .9*fps + .1/DeltaTime one! Dave1707’s is a far better measurement.
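
To make the difference between the two measurements concrete, here is a small sketch in plain Lua with both counters written as closures. newAverageFps and newSmoothedFps are hypothetical names, and the frame times are made-up sample values rather than measurements.

```lua
-- True running average: frames drawn divided by total seconds elapsed.
local function newAverageFps()
    local total, count = 0, 0
    return function(dt)
        count = count + 1
        total = total + dt
        return count / total
    end
end

-- Exponentially smoothed estimate, as in fps = .9*fps + .1/DeltaTime.
-- It needs a starting guess and leans heavily on the most recent frames.
local function newSmoothedFps(initial)
    local fps = initial
    return function(dt)
        fps = 0.9 * fps + 0.1 / dt
        return fps
    end
end

local average, smoothed = newAverageFps(), newSmoothedFps(60)
local frameTimes = {1/60, 1/60, 1/30, 1/60}   -- made-up sample frame times
for i = 1, #frameTimes do
    print(average(frameTimes[i]), smoothed(frameTimes[i]))
end
```

In a Codea draw() function either counter would simply be called with DeltaTime once per frame.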

I just ran a test of ipairs versus a for loop using Lua on my laptop (a slightly more stable environment, and where I can better control competing resources). ipairs came out ever so slightly ahead of for, with pairs slower than both.

@LoopSpace Were the integer keys consecutive when you ran ipairs? And my numbers on that one were not my own, but rather pulled from a few sources. My research all stated that a for loop was far faster. Many of the hard numbers were pulled from a document on lua.org. As for the other thing, there was a large argument about the fps = .9*… thing and many claimed that it worked just as well, so I felt comfortable enough to use it; I will use Dave’s method in the future. And thank you very much for the correction @Dave1707. I thought it was too good to be true myself, but I made the project originally to better learn how built-in functions work and was surprised to find that it claimed to be faster. I don’t doubt your numbers by any means (you’re far more experienced than I), rather I thank you for the lesson, I’m always willing to learn something new!