Profiler4Codea: A simple profiling tool

Hi Codeans,

Having hit a certain level of complexity, I am now hitting performance problems in my game. As these are fairly tricky to track down I have written a basic profiler to help track down which functions are causing issues…

The profiler can automatically track an object’s function invocations then report on number of invocations, total time spent in a function over the profiler run, and then from these create an average time per invocation…

Here is the class:

Profiler4Codea = class()

Profiler4Codea.TimesInvoked = "timesInvoked"
Profiler4Codea.TotalTime = "totalTime"
Profiler4Codea.AvgTime = "avgTime"

Profiler4Codea.Descending = "descending"
Profiler4Codea.Ascending = "ascending"

function Profiler4Codea:init(obj)
    
    self.obj = obj
    if(type(obj) ~= "table") then 
        error("Profiler4Codea:init: obj must be table or class: "..tostring(self.obj))
    end
    local metaTable = getmetatable(self.obj)
    if(metaTable) then
        self.obj = metaTable
    end
    self.metaData = {}
end

function Profiler4Codea:start()
    
    self.clockTime = os.clock()
    
    for name, member in pairs(self.obj) do
        local mType = type(member)
        if(mType == "function") then
            
            if(not self.metaData[name]) then
                self.metaData[name] = {
                    totalTime = 0,
                    timesInvoked = 0,
                    func = name
                }
            end
            self.metaData[name].origFunction = member
            self.obj[name] = function(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)   
                local elapsedTime = os.clock()
                local r1, r2, r3, r4, r5, r6, r7, r8, r9, r10 = 
                    self.metaData[name].origFunction(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
                self.metaData[name].totalTime = self.metaData[name].totalTime + 
                    (os.clock() - elapsedTime)
                self.metaData[name].timesInvoked = self.metaData[name].timesInvoked + 1
                return r1, r2, r3, r4, r5, r6, r7, r8, r9, r10
            end    
        elseif(mType == "table") then
            -- Could recurse here for nested/inner-classed objects  
        end
    end
end

function Profiler4Codea:stop()
    -- Restore function pointers
    for name, meta in pairs(self.metaData) do
        self.obj[name] = meta.origFunction
    end
    return self:report()
end

function Profiler4Codea:report(sortKey, ascDesc)
    
    sortKey = sortKey or Profiler4Codea.TimesInvoked
    ascDesc = ascDesc or Profiler4Codea.Descending
    
    local data = {}
    for name, meta in pairs(self.metaData) do
        -- Calculate average time so we can sort by if required
        if(meta.timesInvoked > 0) then
            meta.avgTime = meta.totalTime/meta.timesInvoked
        else
            meta.avgTime = 0
        end
        table.insert(data, meta)
    end
    
    if(ascDesc == Profiler4Codea.Descending) then
        table.sort(data, function(a,b) return b[sortKey] < a[sortKey]  end)
    elseif(ascDesc == Profiler4Codea.Ascending) then
        table.sort(data, function(a,b) return a[sortKey] < b[sortKey]  end)
    else
        error("Unknown sort key")
    end
    local sb = {}
    table.insert(sb, "Profiler4Codea: Sample time: ")
    table.insert(sb, os.clock() - self.clockTime)
    table.insert(sb, "\\r\
")
    
    table.insert(sb, "Sort key: ")
    table.insert(sb, sortKey)
    table.insert(sb, "\\r\
")
    
    table.insert(sb, "Asc/Desc: ")
    table.insert(sb, ascDesc)
    table.insert(sb, "\\r\
")
    
    for k, meta in ipairs(data) do
        if(meta.timesInvoked > 0) then
            table.insert(sb, "  function:")
            table.insert(sb, meta.func)
            table.insert(sb, "\\r\
")
            table.insert(sb, "    totalTime: ")
            table.insert(sb, tostring(meta.totalTime))
            table.insert(sb, "\\r\
")
            table.insert(sb, "    timesInvoked: ")
            table.insert(sb, tostring(meta.timesInvoked))
            table.insert(sb, "\\r\
")
            table.insert(sb, "    avgTime: ")
            table.insert(sb, string.format("%.6f", meta.avgTime))
            table.insert(sb, "\\r\
") 
        end   
    end
    return table.concat(sb)
end

Basically you pass an object into the profiler, then call start(), stop() to begin and profiling…e.g.

local p4C = Profiler4Codea(object)
p4C:start()
.....
local output = p4C:stop()

The profiler replaces all functions with a wrapper function that calculates the invocation time and the number of invocations. The stop() command then reinstates the functions and generates a report string, which looks something like:

Profiler4Codea: Sample time: 167.944
Sort key: timesInvoked
Asc/Desc: descending
  function:dataMapHas
    totalTime: 3.37
    timesInvoked: 363530
    avgTime: 0.000009
  function:getMap
    totalTime: 4.61807
    timesInvoked: 168544
    avgTime: 0.000027
  function:getViewshedCount
    totalTime: 8.69559
    timesInvoked: 147452
    avgTime: 0.000059

By default the output is sorted by the number of times function invoked in descending order. You can also generate the report by passing in a sort key and a direction e.g.:

local output = p4C:report(Profiler4Codea.AvgTime, Profiler4Codea.Ascending)

Notes:

  • This has been tested with classes, but should work with table objects with functions

  • Although the wrapper functions do add additional overhead, as do all instrumentation solutions, the overhead should be constant

  • The rather horrible p1-p10, and r1-r10 are done because using unpack() is very expensive, and return values have to be captured and then passed back to do the time tracking

  • os.clock() is used as ElapsedTime is only updated per draw frame, and os.clock() appears to give milisecond times

You can either create a global profiler instance and then drive it from the console, or you could invoke the :start() and then put the :stop() in a tween after e.g. 10 seconds…

This could be expanded to recurse through an object to track e.g. ‘inner’ class function calls, with nicely indented output, but that is an exercise for the interested,

Cheers,
Brookesi

@Brookesi Very cool, I actually added something similar to your log4Codea project.

Cool. ~O)