Debugging runtime errors in coroutines and using stack traces

XanDDemoX · June 24, 2013, 6:45pm

I wandered (more blundered ) into something with coroutines which threw me a little for a few minutes (a while ) so I thought I’d write a little about debugging. I couldn’t see something like this on here from google or the custom search so i thought I’d post it but apologies if i am repeating an existing post

So essentially the upshot of my (super fun…) time spent trying to figure out what was going on with my coroutine was: coroutines swallow error messages and won’t print them to the output unless you handle errors and print them to the output yourself

So if you have any potentially error prone code in a coroutine it’ll hopefully save you a few minutes by wrapping it in an xpcall, or pcall to get a stack trace (using debug.traceback()) or at least by knowing the point where the error occurred-ish by printing your own error message

I find debugging is not exactly an exact science and can sometimes be more guessing your way up the call stack until something jumps out at you and hits you in the face with a frying pan so essentially patience is required, you’ll get there eventually :). It can sometimes even require many iterations of commenting one line out, running your application, test and repeat until the culprit (or culprits) are found.

Here’s some examples of using xpcall and pcall to catch errors and get a stack trace when an error occurs

xpcall stack trace - probably quite a long call chain in a real use case.

     local funcWithError = function() return nil_table[1] end
     local r= coroutine.create(function() xpcall(funcWithError,function() print(debug.traceback())  end) end)
     coroutine.resume(r)

-- instance output

stack traceback:
	[string "ClassName = class()..."]:14: in function <[string "ClassName = class()..."]:14> -- first call /entry point here
	[string "ClassName = class()..."]:13: in function <[string "ClassName = class()..."]:13> -- error occurred here
	[C]: in function 'xpcall'
	[string "ClassName = class()..."]:14: in function <[string "ClassName = class()..."]:14> -- error caught here

-- global output (from in setup())

stack traceback:
	[string "-- ProjectName..."]:6: in function <[string "-- ProjectName..."]:6> -- first call / entry point here
	[string "-- ProjectName..."]:5: in function <[string "-- ProjectName..."]:5> -- error occurred here
	[C]: in function 'xpcall'
	[string "-- ProjectName..."]:6: in function <[string "-- ProjectName..."]:6> -- error caught here

pcall stack trace - this is not as detailed as xpcall because it has destroyed the part of the stack (specifically the bit where the error occurred) when it returns true or false for if the call was successful.

In general using just pcall does not seem to give such a nice/detailed stack trace because the stack has already unwound when the function returns but it could be a good starting point for working backwards manually :).
However the callback passed to xpcall is different, this is called before the stack unwinds so this usually has more detail because the call to print(debug.traceback()) is made before xpcall returns and the stack where the error occurred is still available.

     local funcWithError = function() return nil_table[1] end
     local r= coroutine.create(function() if not pcall(funcWithError) then print(debug.traceback())  end  end)
     coroutine.resume(r) 

-- instance output

stack traceback:
	[string "ClassName = class()..."]:14: in function <[string "ClassName = class()..."]:14> 


-- global output

stack traceback:
    [string "-- ProjectName..."]:7: in function <[string "-- ProjectName..."]:7>

And finally (the clean version of) quoting myself scratching my head earlier… “Just what happened here did it work :s, I don’t know the output says nothing so I’m going with yes…”

Nope because the error was eaten up inside the coroutine. Although you can receive a status like pcall from the first return value. False for something went wrong True for all went fine :). A call to print(debug.traceback()) will probably return similar results to that of pcall when ok is false.

     local funcWithError = function() return nil_table[1] end
     local r= coroutine.create(function() funcWithError()  end)
     local ok = coroutine.resume(r) 

-- nothing printed in output

Last but not least some tips for reading stack traces:
Follow your stack traces from the bottom up, the top is the starting point of current the code path, the bottom is where the error emerged (was caught) so following it from the bottom up should take you on a shorter route to the call where the error occurred

You will see some numbers in the stack traces these are line numbers corresponding to the path the code follows/flows through your source files.

Absolutely finally I must say this is by no means a definitive debugging guide, but I hope this will save some headaches when it comes to debugging coroutines or just debugging in general

Jmv38 · June 24, 2013, 7:16pm

Quite interesting thank you!
I had no idea on how to use debug tools of lua, now your post give me an entry point.

XanDDemoX · June 24, 2013, 7:33pm

Thanks @Jmv38, glad you liked it debug info may look like a whole lot of nonsense at a glance and it does take some getting used to but once you get the hang of it, it’s invaluable when investigating problems in your code

Ignatz · June 24, 2013, 7:34pm

@XanDDemoX - what you are saying is really interesting, but it is probably going over a lot of heads, including mine.

I saw there was a pcall and xpcall function in the Lua documentation, but wasn’t sure how to use it. Nor the debug.traceback function.

I’m going to write something about this, and when I do, would you mind checking it for me?

XanDDemoX · June 24, 2013, 7:45pm

Certainly @ignatz. Essentially the bottom line is that the callback in xpcall is called just after your code fails, which is where all the information about the error is because the callback is called within the same scope as xpcall (inside xpcall). Well as in :

 If not pcall(func) then
 -- this happens after your code has failed but also after the pcall function has exited 
-- which means most of the information is lost already because this is outside the chain of events / calls where your code failed (but you still know its failed)
end

Edit: Also to be super clear: you can use xpcall or pcall anywhere you need to handle errors which could potentially happen like converting a string value entered by the user to an integer ( which in lua I think is tonumber()) and then trying to use it as a number when it’s value is actually “nil”. (Which will happen if the user enters something which isn’t a valid number and will cause a runtime error )

Briarfox · June 25, 2013, 1:40am

Thank you @XanDDemoX I was having an issue tracing errors in my coroutines earlier today! Time to play with this some and get a better understanding of how it works!

XanDDemoX · June 25, 2013, 2:56am

Thanks @Ignatz and @Briarfox. Glad you have found it useful hopefully my pain has been your gain

Briarfox · June 25, 2013, 5:09pm

I’m still trying to wrap my head around coroutines. Does coroutine.wrap() work in codea? I keep getting an error.

Ignatz · June 25, 2013, 6:25pm

If you figure that one out, let me know!

SkyTheCoder · June 25, 2013, 6:53pm

I don’t even know what coroutines are…

Ignatz · June 25, 2013, 7:37pm

@SkyTheCoder- try this.
http://coolcodea.wordpress.com/2013/03/25/coroutines-or-how-to-report-progress-of-a-big-job/

XanDDemoX · June 25, 2013, 8:07pm

@Briarfox , @Ignatz I haven’t really used them either but wrap certainly does work in codea here’s a wrap example to try out, it’s a iterator that always counts at the increment given when its created but can then also count an extra amount on a per call basis so the result from yield will be:

Edit: Also to note in this example the value calculated from the previous call is returned on the next

Edit 2: If you need the value to be returned on the current call then perform the calculation before yield.

newValue = (CurrentValue+InitialIncrement+Extra).

(I think I may have inadvertently combined a couple of examples I’ve seen floating about on google for this )


function() setup

    local iter = function(n) 
        local i = 0
        
        return coroutine.wrap(function(extra) 
            while true do

                coroutine.yield(i)
                 i=i+n+extra
            end
        end)
        
    end
    
    local wrapped =iter(1) --create and set initial increment
    
    parameter.integer("Initial_Increment",1,10)
    parameter.action("Set Initial Increment",function() r = wrapped(Initial_Increment) end)
    parameter.integer("Extra_increment",0,10)
    parameter.action("Call", function() print(wrapped(Extra_increment)) end)

end

Ignatz · June 25, 2013, 8:29pm

Cool, thanks

Briarfox · June 25, 2013, 8:32pm

Awesome, thank you!

XanDDemoX · June 25, 2013, 8:54pm

Here’s an extension of the example above which hopefully shows a more practical use for the initial values this can count backwards or forwards based on whether you pass a negative or positive number to the call to c() but it will stop counting when it reaches min or max

    local counter = function(start,min,max) 
        local i = start or 0
        
        return coroutine.wrap(function(n) 
            while true do
                
                if i >= min and i < max then
                    i=i+n
                    coroutine.yield(i,true)
                else
                    coroutine.yield(i,false)
                end
                
                
            end
        end)
        
    end    
    
    local c = counter(0,0,10)
    
    local keepcounting = false
    local count = 0
    repeat
        print("Before: ",count,keepcounting)
        count,keepcounting = c(1)
        print("After:  ",count,keepcounting)
    until keepcounting == false

Jmv38 · June 26, 2013, 1:38am

Here is an exemple of wrap i made for myself to understand how to use coroutines. It shows i can do a long function that take may seconds but still can have a real time screen update (the progress bar and number showing).



--# Main
-- test coroutine

-- Use this function to perform your initial setup
function setup()
    print("trying to use coroutines")
    x1=0
    fps = FPS()
    fps:progressBarInit("long fonction")
end

longFonc2 = coroutine.wrap(function()
    local xmax=100000
    local t0,dt = 0,0.003
    local restart = true
    local killMe = false
    local x = xmax
    local justFinished = false
    while not killMe do
        if restart then 
            x=0 
--            print("t0")
            restart = false
        end
        t0 = os.clock()+ dt
        if x<xmax then
            for i=1,xmax do
                x = x + 1
                if os.clock()>t0 then 
                    fps:progressBarUpdate(x/xmax)
                    coroutine.yield(x,false) 
                    t0 = os.clock()+ dt
                end
            end
            justFinished = true
        end
       if justFinished then 
            fps:progressBarUpdate(x/xmax)
            justFinished = false
        end
       restart,_,t0 = coroutine.yield(x,true) 
    end
end)

caller = coroutine.wrap(function()
    while true do
    while a or not v do
        u,v = longFonc2(a,b,c)
        a,b,c = coroutine.yield(u,v) 
    end
    a,b,c = coroutine.yield(u,v) 
    end
end)

-- This function gets called once every frame
function draw()
    -- This sets a dark background color 
    background(40, 40, 50)

    -- This sets the line thickness
    strokeWidth(5)

    -- Do your drawing here
    local a,b
    --[[
    if math.floor(ElapsedTime/5)~=math.floor((ElapsedTime-DeltaTime)/5) then 
        a,b = longFonc2(true,false,ElapsedTime)
        print(ElapsedTime,a,b)
    else a,b = longFonc2() end
    ]]
    if math.floor(ElapsedTime/5)~=math.floor((ElapsedTime-DeltaTime)/5) then 
        a,b = caller(true,false,ElapsedTime)
        print(ElapsedTime,a,b)
    else a,b = caller() end
    
    
     fps:draw()
    -- write the x on the screen
    fill(208, 208, 208, 255)
    fontSize(100)
    font("AmericanTypewriter-Bold")
    rectMode(CENTER)
    text(tostring(a),WIDTH/2,HEIGHT/2)
    
end


--# FPS
FPS = class()

-- this manages FPS and a progress bar
function FPS:init()
    -- average fps
    self.val = 60
    self.t0 = ElapsedTime
    -- min fps
    self.min = 60
    self.t1 = ElapsedTime
    -- progress bar
    self.frac = 0
    self:progressBarInit()
end

function FPS:draw()
    local vShift = 0
    if self.progressBarActive then vShift = 30 end
    -- update FPS value with some smoothing
    local old = self.val
    local frac = self.frac
--    local t1 = os.clock()
    local t1 = ElapsedTime
    local delta = t1 - self.t0
    self.t0 = t1
    local new = 1/delta or old
    if new<self.min then self.min=new; self.t1=ElapsedTime+1 end
    if self.t1<ElapsedTime then self.min=60 end
    if new>65 then new=65 end
    local ratio = new/old
    if 0.5<ratio and ratio<2 then new = old*(1-frac)+ new*frac end
    self.val = new

    -- write the FPS on the screen
    fill(208, 208, 208, 255)
    fontSize(20)
    font("AmericanTypewriter-Bold")
    rectMode(CENTER)
    text(math.floor(new).." fps (> "..math.floor(self.min)..")",70,HEIGHT-15-vShift)
    -- draw progress bar
    self:progressBarDraw()
end


function FPS:progressBarInit(txt)
    self.frac = 0
    self.txt = txt or "running"
    self.img = self:progressBarCalcInfoImg(self.txt,WIDTH*0.19,30,"top")
    self.progressBarActive = false
end

function FPS:progressBarUpdate(frac)
    self.frac = frac
    if frac>0 and frac<1 then self.progressBarActive = true
    else self.progressBarActive = false end
end

-- image to show job progress
function FPS:progressBarCalcInfoImg(txt,w,h,rot)
    local w0,h0
    pushStyle() pushMatrix()
    if rot=="left" or rot=="right" 
    then w0,h0 = h,w
    else w0,h0 = w,h
    end
    local img0 = image(w0,h0)
    setContext(img0)
        font("AmericanTypewriter-Bold")
        rectMode(CENTER)
        textMode(CENTER)
        strokeWidth(1)
        background(255, 255, 255, 255)
        fill(0, 0, 0, 255)
        stroke(0, 0, 0, 255)
        fontSize(20)
        text(txt,w0/2,h0/2)
    setContext()
    local img = image(w,h)
    setContext(img)
        background(0, 0, 0, 255)
        spriteMode(CENTER)
        translate(w/2,h/2)
        if rot=="left" then rotate(-90) end
        if rot=="right" then rotate(90) end
        sprite(img0,0,0)
    setContext()
    popStyle() popMatrix()
    return img
end

function FPS:progressBarDraw()
    if self.progressBarActive then
    local img = self.img
    pushStyle()
        spriteMode(CORNER)
            tint(128, 128, 128, 255)
            sprite(img, 0, HEIGHT - img.height) 
            tint()
            tint(255, 255, 255, 255)
            clip(0,HEIGHT  - img.height,self.frac*img.width,img.height)
            sprite(img, 0, HEIGHT - img.height) 
            clip()
    popStyle()
    end
end

Briarfox · June 26, 2013, 2:54pm

How to you restart a coroutine.wrap()? Once it hits dead it seems to stay dead…

XanDDemoX · June 26, 2013, 5:20pm

@Jvm38 nicely done :), looks like your really getting the hang of it! :).

@Briarfox it needs to be kept alive inside the function your “wrapping” or re-created as required depending on your requirements ;). Coroutines are mainly used to do long tasks (but they do have some other interesting uses, i saw an example somewhere where one was used for inversion of control).

They usually require some sort of loop and then you yield in between iterations to do other tasks or update the user interface with progress for example. A loop will keep your coroutine alive until the condition which stops the loop is met. Above I’ve kept mine alive indefinitely using an infinate loop:

coroutine.wrap(function()
     while true do
       --- do stuff 
             vars,from,resume = coroutine.yield(return,vars,for,resume)
       end
     -- will never reach here, but will still die on error.
end)

But you could use just a normal “for” loop to iterate through a large collection performing complex calculations, for example, or “repeat until” as well

Coroutines will “die” upon error or falling out past the functions “end” so the trick is to keep it “suspended” inside your wrapped function until you no longer need it / your long running task is complete, whilst yielding to do other things

Briarfox · June 26, 2013, 5:53pm

@XanDemoX Think I figured it out. I do not believe you can restart a coroutine.wrap once it has died. Because if you call it, it returns that it’s dead. I couldnt figure out a way to restart it.

I ended up just using create and it fixed my issues. Thank you all for the help and sharing code

XanDDemoX · June 26, 2013, 6:18pm

@Briarfox, my apologies i wasn’t massively clear on that point :), you are 100% correct, you can’t resurrect one once its dead (but you can stop that happening )