Optimize performance

Hi

I have stitched this small piece of code together, just to play with Codea.

I wonder if there is a way to optimize the code, as it gets quite ‘heavy’ rather fast.

I think it is the quite complex form of the physics.body that makes things slow down.

I guess a solution would be separating the ‘body’ and ‘drawing’ and make a less complex physics model?



--# Main
function setup()
    noSmooth()
    supportedOrientations(PORTRAIT)
    displayMode(FULLSCREEN)
    lineCapMode(PROJECT)
    l = {}
    count = 0
    strokeWidth(1)
    edge1 = physics.body(EDGE, vec2(0,0), vec2(WIDTH, 0))
    edge2 = physics.body(EDGE, vec2(0,0), vec2(0, HEIGHT))
    edge3 = physics.body(EDGE, vec2(WIDTH,0), vec2(WIDTH, HEIGHT))
    edge4 = physics.body(EDGE, vec2(0,HEIGHT), vec2(WIDTH, HEIGHT))
    edge1.type = STATIC
    edge2.type = STATIC
    edge3.type = STATIC
    edge4.type = STATIC
    edge1.restitution = .1
    edge2.restitution = .1
    edge3.restitution = .1
    edge4.restitution = .1
end

function draw()
    physics.gravity(Gravity)
    background(176, 208, 32, 255)
    for i in pairs(l) do
        l[i]:draw()
    end
end

function touched(touch)
    if touch.state == ENDED then
        l[count] = logo(CurrentTouch.x, CurrentTouch.y)
        count = count + 1
    end    
end
--# logo
logo = class()

function logo:init(x,y)
    self.l = physics.body(POLYGON, 
        vec2(0,60), 
        vec2(0,160), 
        vec2(40,90),
        vec2(120,130),
        vec2(120,40),
        vec2(110,30),
        vec2(110,110),
        vec2(50,80),
        vec2(50,5),
        vec2(40,0),
        vec2(30,11),
        vec2(30,80),
        vec2(10,115),
        vec2(10,40)
    )
    self.l.x = x
    self.l.y = y
    --self.l.restitution = .9 --restitution
    
end

function logo:draw()
    --fill(255, 255, 255, 255)
    pushMatrix()
    translate(self.l.x, self.l.y)
    rotate(self.l.angle)
    local pts = self.l.points  -- fetch the point list once per draw
    for j = 1, #pts do
        local a = pts[j]
        local b = pts[(j % #pts) + 1]  -- wrap around to close the outline
        line(a.x, a.y, b.x, b.y)
    end
    popMatrix()
end

@Jorgensen I’m not sure what you’re having problems with. The program seems to run OK for me. The only problem I had was when creating an object close to the top of the screen, where the object got hung up on the top edge. When I commented out edge4, the objects fell OK, unless I created them near the top and close to a side, where they got hung up on the side edges. Other than that, it seems to work as coded and is compact for what’s there.

Hi Dave

Thanks for taking the time to test it. I just wondered if it could somehow be optimized, as the fps gets low when adding several objects. But that’s probably a hardware limitation.

You are right, the objects have a bad habit of getting stuck. But that’s OK with me, I’ll just press the middle of the screen.

Jorgensen

@Jorgensen Now that I know the FPS goes down, I’ll add an FPS routine and play around with it more to see what can be done to speed it up.

Hello @Jorgensen. I think that line() may be expensive. Why not draw the body’s outline only once onto an image (self.image) when the logo is first created and then sprite that image in logo:draw()?

(Update) Scratch that thought. Now I have actually run the code, my guess is that the behaviour does indeed reflect the complex concave shape of the polygons.

Does the following perform any better?



--# Main
supportedOrientations(PORTRAIT)
displayMode(FULLSCREEN)
function setup()
    l = {}
    count = 0
    edge1 = physics.body(EDGE, vec2(0,0), vec2(WIDTH, 0))
    edge2 = physics.body(EDGE, vec2(0,0), vec2(0, HEIGHT))
    edge3 = physics.body(EDGE, vec2(WIDTH,0), vec2(WIDTH, HEIGHT))
    edge4 = physics.body(EDGE, vec2(0,HEIGHT), vec2(WIDTH, HEIGHT))
    edge1.type = STATIC
    edge2.type = STATIC
    edge3.type = STATIC
    edge4.type = STATIC
    edge1.restitution = .1
    edge2.restitution = .1
    edge3.restitution = .1
    edge4.restitution = .1
end

function draw()
    physics.gravity(Gravity)
    background(176, 208, 32)
    for i in pairs(l) do
        l[i]:draw()
    end
end

function touched(touch)
    if touch.state == ENDED then
        l[count] = logo(CurrentTouch.x, CurrentTouch.y)
        count = count + 1
    end    
end
--# logo
logo = class()

function logo:init(x,y)
    local shapeFull = {
        vec2(0,60), 
        vec2(0,160), 
        vec2(40,90),
        vec2(120,130),
        vec2(120,40),
        vec2(110,30),
        vec2(110,110),
        vec2(50,80),
        vec2(50,5),
        vec2(40,0),
        vec2(30,11),
        vec2(30,80),
        vec2(10,115),
        vec2(10,40)
    }
    local shapeSimple = {
        vec2(0,60), 
        vec2(0,160), 
        vec2(40,90),
        vec2(120,130),
        vec2(120,40),
        vec2(110,30),
        vec2(50,5),
        vec2(40,0),
        vec2(30,11),
        vec2(10,40)
    }
    self.l = physics.body(POLYGON, unpack(shapeSimple))
    self.l.x = x
    self.l.y = y
    self.image = image(121, 161)
    setContext(self.image)
    lineCapMode(PROJECT)
    stroke(255)
    strokeWidth(1)
    noSmooth()
    for j = 1, #shapeFull do
        a = shapeFull[j] + vec2(1, 1)
        b = shapeFull[(j % #shapeFull) + 1] + vec2(1, 1)
        line(a.x, a.y, b.x, b.y)
    end
    setContext()
end

function logo:draw()
    pushMatrix()
    pushStyle()
    resetMatrix()
    spriteMode(CORNER)
    noSmooth()
    translate(self.l.x, self.l.y)
    rotate(self.l.angle)
    sprite(self.image, 0, 0)
    popMatrix()
    popStyle()
end

Performance can depend on the shape of the polygon. Box2D only allows convex polygons, so Codea has to decompose concave polygons into a number of convex polygons. This is why objects created near the edge objects can get stuck: they can get jammed in between two of the convex polygons within the same object. Performance-wise, once you have several dozen objects colliding with each other, you usually see performance start to degrade.
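As a rough way to see how concave a shape is before handing it to physics.body, one can count its reflex (inward-pointing) vertices. The sketch below is plain Lua, so it uses {x=, y=} tables in place of vec2. The names pt, signedArea and countReflex are hypothetical helpers, not part of Codea, and the count is only a hint about how much decomposition a shape may need, not a description of what Codea does internally.

```lua
-- Stand-in for vec2 so the sketch runs outside Codea.
function pt(x, y) return {x = x, y = y} end

-- Twice the signed area (shoelace formula); negative means clockwise order.
function signedArea(pts)
    local sum = 0
    for i = 1, #pts do
        local a, b = pts[i], pts[(i % #pts) + 1]
        sum = sum + a.x * b.y - b.x * a.y
    end
    return sum
end

-- Count vertices where the outline turns against its overall winding.
function countReflex(pts)
    local winding = signedArea(pts) < 0 and -1 or 1
    local n, reflex = #pts, 0
    for i = 1, n do
        local p = pts[((i - 2) % n) + 1]  -- previous vertex (wraps to the last)
        local c = pts[i]
        local q = pts[(i % n) + 1]        -- next vertex (wraps to the first)
        local cross = (c.x - p.x) * (q.y - c.y) - (c.y - p.y) * (q.x - c.x)
        if cross * winding < 0 then reflex = reflex + 1 end
    end
    return reflex
end

-- The two outlines used in the code above.
shapeFull = {
    pt(0,60), pt(0,160), pt(40,90), pt(120,130), pt(120,40), pt(110,30),
    pt(110,110), pt(50,80), pt(50,5), pt(40,0), pt(30,11), pt(30,80),
    pt(10,115), pt(10,40)
}
shapeSimple = {
    pt(0,60), pt(0,160), pt(40,90), pt(120,130), pt(120,40), pt(110,30),
    pt(50,5), pt(40,0), pt(30,11), pt(10,40)
}

print(countReflex(shapeFull), countReflex(shapeSimple))
```

The simplified outline reports fewer reflex vertices than the full one, but not zero, so it is less concave rather than convex.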

This is kind of brute force, but the FPS remained high when I created a lot of objects. I put several lines of code on 1 line to shorten the amount of code space here. I put comments in where I added or changed code.


--# Main
function setup()
    noSmooth()
    supportedOrientations(PORTRAIT)
    displayMode(FULLSCREEN)
    lineCapMode(PROJECT)
    l = {}
    count = 0
    strokeWidth(1)
    edge1 = physics.body(EDGE, vec2(0,0), vec2(WIDTH, 0))
    edge2 = physics.body(EDGE, vec2(0,0), vec2(0, HEIGHT))
    edge3 = physics.body(EDGE, vec2(WIDTH,0), vec2(WIDTH, HEIGHT))
    edge4 = physics.body(EDGE, vec2(0,HEIGHT), vec2(WIDTH, HEIGHT))
    edge1.type = STATIC
    edge2.type = STATIC
    edge3.type = STATIC
    edge4.type = STATIC
    edge1.restitution = .1
    edge2.restitution = .1
    edge3.restitution = .1
    edge4.restitution = .1
    
    -- added these 4 lines
    p1=vec2(0,60) p2=vec2(0,160) p3=vec2(40,90) p4=vec2(120,130)
    p5=vec2(120,40) p6=vec2(110,30) p7=vec2(110,110) p8=vec2(50,80)
    p9=vec2(50,5) p10=vec2(40,0) p11=vec2(30,11) p12=vec2(30,80)
    p13=vec2(10,115) p14=vec2(10,40)
end

function draw()
    physics.gravity(Gravity)
    background(176, 208, 32, 255)
    for i in pairs(l) do
        l[i]:draw()
    end
end

function touched(touch)
    if touch.state == ENDED then
        l[count] = logo(CurrentTouch.x, CurrentTouch.y)
        count = count + 1
    end    
end

logo = class()

function logo:init(x,y)
    self.l = physics.body(POLYGON, vec2(0,60),vec2(0,160),vec2(40,90),
        vec2(120,130),vec2(120,40),vec2(110,30),vec2(110,110),
        vec2(50,80),vec2(50,5),vec2(40,0),vec2(30,11),
        vec2(30,80),vec2(10,115),vec2(10,40)
    )
    self.l.x = x
    self.l.y = y
end

function logo:draw()
    local line=line     -- added this line
    pushMatrix()
    translate(self.l.x, self.l.y)
    rotate(self.l.angle)
    
    -- added these lines to avoid the for loop
    line(p1.x,p1.y,p2.x,p2.y)
    line(p2.x,p2.y,p3.x,p3.y)
    line(p3.x,p3.y,p4.x,p4.y)
    line(p4.x,p4.y,p5.x,p5.y)
    line(p5.x,p5.y,p6.x,p6.y)
    line(p6.x,p6.y,p7.x,p7.y)
    line(p7.x,p7.y,p8.x,p8.y)
    line(p8.x,p8.y,p9.x,p9.y)
    line(p9.x,p9.y,p10.x,p10.y)
    line(p10.x,p10.y,p11.x,p11.y)
    line(p11.x,p11.y,p12.x,p12.y)
    line(p12.x,p12.y,p13.x,p13.y)
    line(p13.x,p13.y,p14.x,p14.y)
    line(p14.x,p14.y,p1.x,p1.y)
    
    -- removed the for loop
    popMatrix()
end

Here is one thing that can make a surprising amount of difference:


function draw()
    local i
    physics.gravity(Gravity)
    background(176, 208, 32, 255)
    for i in pairs(l) do
        l[i]:draw()
    end
end

Hello @Mark. That is an interesting comment. Could you elaborate? Why does creating a local variable i outside of the generic for loop affect the speed of the for loop? My understanding is the i that exists within the scope of the loop is unrelated to the i outside the loop’s scope…
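For reference, the Lua manual describes the control variable of a for loop as local to the loop body, so a local i declared beforehand is shadowed rather than reused. A small plain-Lua check of the scoping only (it says nothing about speed; the function name loopScopeDemo is just for illustration):

```lua
function loopScopeDemo()
    local i = "outer"               -- declared before the loop, as in the snippet above
    local visited = 0
    for i in pairs({a = 1, b = 2}) do
        visited = visited + 1       -- inside the loop, i is the loop's own variable
    end
    return i, visited               -- the outer i is untouched by the loop
end

print(loopScopeDemo())
```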

How about using an array for l, rather than a generic table, and caching the logo.draw function? (Update) And use the no argument version of background()?




--# Main
supportedOrientations(PORTRAIT)
displayMode(FULLSCREEN)
function setup()
    l = {}
    count = 0
    edge1 = physics.body(EDGE, vec2(0,0), vec2(WIDTH, 0))
    edge2 = physics.body(EDGE, vec2(0,0), vec2(0, HEIGHT))
    edge3 = physics.body(EDGE, vec2(WIDTH,0), vec2(WIDTH, HEIGHT))
    edge4 = physics.body(EDGE, vec2(0,HEIGHT), vec2(WIDTH, HEIGHT))
    edge1.restitution = .1
    edge2.restitution = .1
    edge3.restitution = .1
    edge4.restitution = .1
    background(176, 208, 32)
end

function draw()
    physics.gravity(Gravity)
    background()
    local d = logo.draw
    for i = 1, count do
        d(l[i])
    end
end

function touched(touch)
    if touch.state == ENDED then
        count = count + 1
        l[count] = logo(CurrentTouch.x, CurrentTouch.y)
    end    
end
--# logo
logo = class()

function logo:init(x,y)
    local shapeFull = {
        vec2(0,60), 
        vec2(0,160), 
        vec2(40,90),
        vec2(120,130),
        vec2(120,40),
        vec2(110,30),
        vec2(110,110),
        vec2(50,80),
        vec2(50,5),
        vec2(40,0),
        vec2(30,11),
        vec2(30,80),
        vec2(10,115),
        vec2(10,40)
    }
    local shapeSimple = {
        vec2(0,60), 
        vec2(0,160), 
        vec2(40,90),
        vec2(120,130),
        vec2(120,40),
        vec2(110,30),
        vec2(50,5),
        vec2(40,0),
        vec2(30,11),
        vec2(10,40)
    }
    self.l = physics.body(POLYGON, unpack(shapeSimple))
    self.l.x = x
    self.l.y = y
    self.image = image(121, 161)
    setContext(self.image)
    lineCapMode(PROJECT)
    stroke(255)
    strokeWidth(1)
    noSmooth()
    for j = 1, #shapeFull do
        a = shapeFull[j] + vec2(1, 1)
        b = shapeFull[(j % #shapeFull) + 1] + vec2(1, 1)
        line(a.x, a.y, b.x, b.y)
    end
    setContext()
end

function logo:draw()
    pushMatrix()
    pushStyle()
    resetMatrix()
    spriteMode(CORNER)
    noSmooth()
    translate(self.l.x, self.l.y)
    rotate(self.l.angle)
    sprite(self.image, 0, 0)
    popMatrix()
    popStyle()
end

On an iPad 2, my last attempt above runs at 60 fps with the Viewer chock full of logos.
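One detail the numeric loop in the code above depends on: count is incremented before the logo is stored, so the list starts at index 1. The original version stored the first logo at l[0], which pairs visits but ipairs (and for i = 1, count) would skip. A plain-Lua sketch, with hypothetical helper names and strings standing in for logo objects:

```lua
-- Original pattern: store first, then increment, so the first key is 0.
function buildZeroBased(n)
    local t, count = {}, 0
    for k = 1, n do
        t[count] = "logo" .. k
        count = count + 1
    end
    return t
end

-- Revised pattern: increment first, so the keys run 1..n.
function buildOneBased(n)
    local t, count = {}, 0
    for k = 1, n do
        count = count + 1
        t[count] = "logo" .. k
    end
    return t
end

-- Number of entries that an ipairs traversal reaches.
function countWithIpairs(t)
    local c = 0
    for _ in ipairs(t) do c = c + 1 end
    return c
end

print(countWithIpairs(buildZeroBased(3)), countWithIpairs(buildOneBased(3)))
```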

@mpilgrem With your latest code, the pieces don’t seem to interact like the pieces in the original code. In the original code, the pieces can intermingle with each other. In your code, the pieces don’t. I think it’s the shapeSimple that’s causing this.

Hello @dave1707. Yes, part of my optimisation is to implement what @Jorgensen suggested in his or her initial post, and make the Physics polygon less concave than the associated logo polygon.

@mpilgrem you’re right that introducing a variable in this way doesn’t give it global scope. However, I’ve found that adding explicit local scope for even a single variable can make a 1-2% difference in performance. If you have a lot of such values (using ipairs, loop variables can really proliferate) it can easily make a noticeable difference.

Hi all

Thank you all for your contributions. I’ll have to figure out how to add an FPS counter so I can play around with the code :)

As I thought, the complicated physics.body form eats a lot of frames.

I think I have to try out @mpilgrem’s suggestions.

Thanks everyone

Just add the FPS function below, and in function draw() add an FPS call with the x,y location where you want it to show. This shows frames per second to 2 decimal places.


function draw()
     FPS(300,400)
end


function FPS(x,y)
    fill(255)
    local str = string.format("%.2f", 1/DeltaTime)  -- local avoids leaking a global
    text(str,x,y)
end

If you are interested, you can use my FPS() class. It shows:

  • the average fps, and
  • the minimum in the last 2 seconds.

The minimum is important because when you have 1 long frame from time to time, it doesn’t show in the average, but it is very visible.

To use my class:

  • declare fps=FPS() in setup().
  • put fps:draw() in draw().

That’s it!

There is a progress bar included. You may use it too, or you can just ignore it: it will not show up.

FPS = class()

-- this manages FPS and a progress bar
function FPS:init()
    -- average fps
    self.val = 60
    self.t0 = ElapsedTime
    -- min fps
    self.min = 60
    self.t1 = ElapsedTime
    -- progress bar
    self.frac = 0
    self:progressBarInit()
end

function FPS:draw()
    local vShift = 0
    if self.progressBarActive then vShift = 30 end
    -- update FPS value with some smoothing
    local old = self.val
    local frac = 0.1
--    local t1 = os.clock()
    local delta = DeltaTime
    local new = (delta > 0) and 1/delta or old  -- keep the old value if delta is 0
    if new<self.min then self.min=new; self.t1=ElapsedTime+1 end
    if self.t1<ElapsedTime then self.min=60 end
    new = old*(1-frac)+ new*frac
    self.val = new

    -- write the FPS on the screen
    pushStyle()
    fill(208, 208, 208, 255)
    fontSize(20)
    font("AmericanTypewriter-Bold")
    rectMode(CENTER)
    text(math.floor(new).." fps (> "..math.floor(self.min)..")",70,HEIGHT-15-vShift)
    popStyle()
    -- draw progress bar
    self:progressBarDraw()
end


function FPS:progressBarInit(txt)
    self.frac = 0
    self.txt = txt or "running"
    self.img = self:progressBarCalcInfoImg(self.txt,WIDTH*0.19,30,"top")
    self.progressBarActive = false
end

function FPS:progressBarUpdate(frac)
    self.frac = frac
    if frac>0 and frac<1 then self.progressBarActive = true
    else self.progressBarActive = false end
end

-- image to show job progress
function FPS:progressBarCalcInfoImg(txt,w,h,rot)
    local w0,h0
    pushStyle() pushMatrix()
    if rot=="left" or rot=="right" 
    then w0,h0 = h,w
    else w0,h0 = w,h
    end
    local img0 = image(w0,h0)
    setContext(img0)
        font("AmericanTypewriter-Bold")
        rectMode(CENTER)
        textMode(CENTER)
        strokeWidth(1)
        background(255, 255, 255, 255)
        fill(0, 0, 0, 255)
        stroke(0, 0, 0, 255)
        fontSize(20)
        text(txt,w0/2,h0/2)
    setContext()
    local img = image(w,h)
    setContext(img)
        background(0, 0, 0, 255)
        spriteMode(CENTER)
        translate(w/2,h/2)
        if rot=="left" then rotate(-90) end
        if rot=="right" then rotate(90) end
        sprite(img0,0,0)
    setContext()
    popStyle() popMatrix()
    return img
end

function FPS:progressBarDraw()
    if self.progressBarActive then
    local img = self.img
    pushStyle()
        spriteMode(CORNER)
            tint(128, 128, 128, 255)
            sprite(img, 0, HEIGHT - img.height) 
            tint()
            tint(255, 255, 255, 255)
            clip(0,HEIGHT  - img.height,self.frac*img.width,img.height)
            sprite(img, 0, HEIGHT - img.height) 
            clip()
    popStyle()
    end
end
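To put numbers on why the minimum matters, here is a small plain-Lua sketch. It uses a simple time-based average rather than the smoothed value the class keeps, and the frame times are invented for illustration: 59 frames at 1/60 s plus one long frame of 0.1 s. The helper name fpsStats is hypothetical, and it assumes a non-empty list of positive frame times.

```lua
-- Returns the average fps over the whole list and the fps of the slowest frame.
function fpsStats(frameTimes)
    local total, longest = 0, 0
    for _, dt in ipairs(frameTimes) do
        total = total + dt
        if dt > longest then longest = dt end
    end
    return #frameTimes / total, 1 / longest
end

-- Invented sample: one hitch among otherwise smooth frames.
local frames = {}
for k = 1, 59 do frames[k] = 1/60 end
frames[60] = 0.1

local avg, worst = fpsStats(frames)
print(string.format("average %.1f fps, minimum %.1f fps", avg, worst))
```

The average stays in the mid-50s while the single slow frame corresponds to about 10 fps, which is the hitch you actually notice.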

Hello @Mark. I have not been able to demonstrate what you suggest. The results of my test code below show no consistent improvement:


function setup()
    local clock = os.clock
    
    local startTime1 = clock()
    for i = 1, 10000000 do
        local k1 = i
    end
    local endTime1 = clock()

    local j -- This time create a local variable with the same name
    local startTime2 = clock()
    for j = 1, 10000000 do
        local k2 = j
    end
    local endTime2 = clock()
    
    print(endTime1 - startTime1)
    print(endTime2 - startTime2)
end

function draw() background(0) end

Update: Also, on a PC version of Lua, I examined the bytecode produced by luac. I can see no reason why local j would cause the for loop to run quicker - it has no effect on the bytecode for the loop.

Hi jmv

Thanks.

How do you activate the progress bar? :)

And also thanks to you David for a nice and clean example.
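
Reading the class above, the bar seems to be driven by FPS:progressBarUpdate(frac): it is shown while frac is strictly between 0 and 1 and hidden otherwise. A minimal sketch, assuming the FPS class above is in the project; jobFraction, done and total are made-up names for illustration, and the 300-step job is invented:

```lua
-- Hypothetical helper: fraction of a job completed, clamped to [0, 1].
function jobFraction(done, total)
    if total <= 0 then return 0 end
    return math.max(0, math.min(1, done / total))
end

-- Codea callbacks; FPS is the class posted above.
function setup()
    fps = FPS()
    fps:progressBarInit("loading")  -- optional: the label defaults to "running"
    done, total = 0, 300            -- invented job of 300 steps
end

function draw()
    background(0)
    if done < total then done = done + 1 end  -- pretend one step finishes per frame
    fps:progressBarUpdate(jobFraction(done, total))
    fps:draw()                      -- draws the fps text and, while active, the bar
end
```

With this sketch the bar disappears once the fraction reaches 1, because the class only treats values strictly between 0 and 1 as active.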