Hi Codeans,
Having hit a certain level of complexity, I am now hitting performance problems in my game. As these are fairly tricky to track down I have written a basic profiler to help track down which functions are causing issues…
The profiler can automatically track an object’s function invocations then report on number of invocations, total time spent in a function over the profiler run, and then from these create an average time per invocation…
Here is the class:
Profiler4Codea = class()
Profiler4Codea.TimesInvoked = "timesInvoked"
Profiler4Codea.TotalTime = "totalTime"
Profiler4Codea.AvgTime = "avgTime"
Profiler4Codea.Descending = "descending"
Profiler4Codea.Ascending = "ascending"
function Profiler4Codea:init(obj)
self.obj = obj
if(type(obj) ~= "table") then
error("Profiler4Codea:init: obj must be table or class: "..tostring(self.obj))
end
local metaTable = getmetatable(self.obj)
if(metaTable) then
self.obj = metaTable
end
self.metaData = {}
end
function Profiler4Codea:start()
self.clockTime = os.clock()
for name, member in pairs(self.obj) do
local mType = type(member)
if(mType == "function") then
if(not self.metaData[name]) then
self.metaData[name] = {
totalTime = 0,
timesInvoked = 0,
func = name
}
end
self.metaData[name].origFunction = member
self.obj[name] = function(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
local elapsedTime = os.clock()
local r1, r2, r3, r4, r5, r6, r7, r8, r9, r10 =
self.metaData[name].origFunction(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
self.metaData[name].totalTime = self.metaData[name].totalTime +
(os.clock() - elapsedTime)
self.metaData[name].timesInvoked = self.metaData[name].timesInvoked + 1
return r1, r2, r3, r4, r5, r6, r7, r8, r9, r10
end
elseif(mType == "table") then
-- Could recurse here for nested/inner-classed objects
end
end
end
function Profiler4Codea:stop()
-- Restore function pointers
for name, meta in pairs(self.metaData) do
self.obj[name] = meta.origFunction
end
return self:report()
end
function Profiler4Codea:report(sortKey, ascDesc)
sortKey = sortKey or Profiler4Codea.TimesInvoked
ascDesc = ascDesc or Profiler4Codea.Descending
local data = {}
for name, meta in pairs(self.metaData) do
-- Calculate average time so we can sort by if required
if(meta.timesInvoked > 0) then
meta.avgTime = meta.totalTime/meta.timesInvoked
else
meta.avgTime = 0
end
table.insert(data, meta)
end
if(ascDesc == Profiler4Codea.Descending) then
table.sort(data, function(a,b) return b[sortKey] < a[sortKey] end)
elseif(ascDesc == Profiler4Codea.Ascending) then
table.sort(data, function(a,b) return a[sortKey] < b[sortKey] end)
else
error("Unknown sort key")
end
local sb = {}
table.insert(sb, "Profiler4Codea: Sample time: ")
table.insert(sb, os.clock() - self.clockTime)
table.insert(sb, "\\r\
")
table.insert(sb, "Sort key: ")
table.insert(sb, sortKey)
table.insert(sb, "\\r\
")
table.insert(sb, "Asc/Desc: ")
table.insert(sb, ascDesc)
table.insert(sb, "\\r\
")
for k, meta in ipairs(data) do
if(meta.timesInvoked > 0) then
table.insert(sb, " function:")
table.insert(sb, meta.func)
table.insert(sb, "\\r\
")
table.insert(sb, " totalTime: ")
table.insert(sb, tostring(meta.totalTime))
table.insert(sb, "\\r\
")
table.insert(sb, " timesInvoked: ")
table.insert(sb, tostring(meta.timesInvoked))
table.insert(sb, "\\r\
")
table.insert(sb, " avgTime: ")
table.insert(sb, string.format("%.6f", meta.avgTime))
table.insert(sb, "\\r\
")
end
end
return table.concat(sb)
end
Basically you pass an object into the profiler, then call start(), stop() to begin and profiling…e.g.
local p4C = Profiler4Codea(object)
p4C:start()
.....
local output = p4C:stop()
The profiler replaces all functions with a wrapper function that calculates the invocation time and the number of invocations. The stop() command then reinstates the functions and generates a report string, which looks something like:
Profiler4Codea: Sample time: 167.944
Sort key: timesInvoked
Asc/Desc: descending
function:dataMapHas
totalTime: 3.37
timesInvoked: 363530
avgTime: 0.000009
function:getMap
totalTime: 4.61807
timesInvoked: 168544
avgTime: 0.000027
function:getViewshedCount
totalTime: 8.69559
timesInvoked: 147452
avgTime: 0.000059
By default the output is sorted by the number of times function invoked in descending order. You can also generate the report by passing in a sort key and a direction e.g.:
local output = p4C:report(Profiler4Codea.AvgTime, Profiler4Codea.Ascending)
Notes:
-
This has been tested with classes, but should work with table objects with functions
-
Although the wrapper functions do add additional overhead, as do all instrumentation solutions, the overhead should be constant
-
The rather horrible p1-p10, and r1-r10 are done because using unpack() is very expensive, and return values have to be captured and then passed back to do the time tracking
-
os.clock() is used as ElapsedTime is only updated per draw frame, and os.clock() appears to give milisecond times
You can either create a global profiler instance and then drive it from the console, or you could invoke the :start() and then put the :stop() in a tween after e.g. 10 seconds…
This could be expanded to recurse through an object to track e.g. ‘inner’ class function calls, with nicely indented output, but that is an exercise for the interested,
Cheers,
Brookesi