Performance of arc mesh/shader vs ellipse function

Andrew_Stacey · May 10, 2014, 10:38am

This creates the arc “object”: a mesh with a shader attached. The arc function calls this with different parameters passed to the shader. The Arc class uses this function to create a new mesh for every instance.

So using the arc function means one mesh, but every invocation requires fresh data passed to the GPU. Using the Arc class means many meshes, but each is hard-coded with its data (though it can be changed).

local __makeArc = function(nsteps)
    -- nsteps doesn't make a huge difference in the range 50,300
    nsteps = nsteps or 50
    local m = mesh()
    m.shader = shader([[

Now we examine the shader code, starting with the vertex shader.

//
// A basic vertex shader
//

//This is the current model * view * projection matrix
// Codea sets it automatically
uniform mat4 modelViewProjection;

The uniforms scolour and ecolour hold the start and end colours of the arc. From these, we compute the difference “colour” (it may not be a valid colour, but that’s not important).

uniform lowp vec4 scolour;
uniform lowp vec4 ecolour;
lowp vec4 mcolour = ecolour - scolour;

Standard mesh attributes and varyings.

//This is the current mesh vertex position, color and tex coord
// Set automatically
attribute vec4 position;
attribute vec4 color;
attribute vec2 texCoord;

varying highp vec2 vTexCoord;
varying lowp vec4 vColour;
varying highp float vWidth;

The varying vCore holds the parameter along the curve clamped to [0,1] which will be used in the fragment shader to deal with the ends. The body of the curve corresponds to y positions in [0,1] and the caps extend beyond, so if y == vCore then we’re in the body of the curve and if y ~= vCore then we’re in one of the caps.

varying highp float vCore;

These constants control the appearance of the curve. width is hopefully obvious, taper is how much the width varies, blur is how much the edges are blurred. So the actual starting width, swidth, is width + blur and the actual ending width, ewidth, is taper * width - width (hmm, now that I look at it maybe there should be a + blur here!). We can set a global cap type for both ends and scap and ecap are used to turn it on or off for start and finish separately. The variables ecapsw and scapsw contain how far the caps extend from the curve.

uniform float width;
uniform float taper;
uniform float blur;
uniform float cap;
uniform float scap;
uniform float ecap;
float swidth = width + blur;
float ewidth = taper*width - width;
float ecapsw = clamp(cap,0.,1.)*ecap;
float scapsw = clamp(cap,0.,1.)*scap;

These uniforms control the actual arc parameters. The angles are in radians. The axes are vectors as they needn’t be the x-axis and y-axis.

uniform vec2 centre;
uniform vec2 xaxis;
uniform vec2 yaxis;
uniform float startAngle;
uniform float deltaAngle;

Now we enter the main function. The idea of the shader is that the mesh is a series of rectangles running from (-.5,0) to (.5,1) (well, it extends slightly above and below for the line caps). The y-coordinate is used to parametrise the arc and the x-coordinate goes across the curve. So the vertex shader needs to move these vertices into the right places and scale them accordingly.

void main()
{

As said above, the y value is in the interval [0,1] except for the caps which extend slightly out, so t is the actual parameter along the curve.

    highp float t = clamp(position.y,0.,1.);
    vCore = t;

To compute the width, we need to interpolate between the starting and ending widths. Rather than a linear interpolation, we use smoothstep so that the width changes only slowly near the start and end of the arc. This means that if we put two arcs next to each other with matching widths then there won’t be a noticeable jump in the rate at which the width changes.

    highp float w = smoothstep(0.,1.,t);
    vWidth = w*ewidth + swidth;
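As a rough check of the width interpolation, here is a minimal plain-Lua sketch that mirrors the shader arithmetic on the CPU. The `smoothstep` helper reimplements the GLSL built-in, and `arcWidth` is a hypothetical name introduced only for illustration; neither is part of the original code.

```lua
-- CPU-side mirror of GLSL smoothstep(edge0, edge1, x)
local function smoothstep(edge0, edge1, x)
    local t = math.min(math.max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
end

-- Hypothetical helper: the value the vertex shader stores in vWidth
local function arcWidth(t, width, taper, blur)
    local swidth = width + blur            -- starting width
    local ewidth = taper * width - width   -- change in width along the arc
    return smoothstep(0, 1, t) * ewidth + swidth
end

-- Widths at the start, middle and end of an arc (sample values are arbitrary)
print(arcWidth(0, 10, 0.5, 2), arcWidth(0.5, 10, 0.5, 2), arcWidth(1, 10, 0.5, 2))
```

In this sketch the width at t = 1 works out to taper*width + blur, because ewidth is added to swidth rather than replacing it.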

This contains the position along the curve.

    highp vec2 bpos = centre + cos(t*deltaAngle + startAngle) * xaxis + sin(t*deltaAngle + startAngle) * yaxis;

To work out the thickness, we need to know the normal vector at this point on the curve. We start with the tangent vector.

    highp vec2 bdir = -sin(t*deltaAngle + startAngle) * xaxis + cos(t*deltaAngle + startAngle) * yaxis;

Rotate it to get the normal vector.

    bdir = vec2(bdir.y,-bdir.x);

Normalise, and multiply by the width and the x-coordinate of the vertex.

    bdir = vWidth*normalize(bdir);
    bpos = bpos + position.x*bdir;
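The position, tangent and normal steps can be tried outside the shader. The following plain-Lua sketch mirrors them under some assumptions: plain tables with `x` and `y` fields stand in for `vec2`, and the function names `arcPointAndNormal` and `placeVertex` are hypothetical, not taken from the original code.

```lua
-- Plain tables {x=, y=} stand in for vec2 so this runs in stock Lua.
-- Hypothetical helper mirroring the vertex shader: returns the point on the
-- arc at parameter t and the unit normal there.
local function arcPointAndNormal(t, centre, xaxis, yaxis, startAngle, deltaAngle)
    local a = t * deltaAngle + startAngle
    local c, s = math.cos(a), math.sin(a)
    local px = centre.x + c * xaxis.x + s * yaxis.x
    local py = centre.y + c * xaxis.y + s * yaxis.y
    -- tangent: derivative of the position with respect to the angle
    local tx = -s * xaxis.x + c * yaxis.x
    local ty = -s * xaxis.y + c * yaxis.y
    -- rotate the tangent by a quarter turn to get the normal, then normalise
    local nx, ny = ty, -tx
    local len = math.sqrt(nx * nx + ny * ny)
    return px, py, nx / len, ny / len
end

-- Hypothetical helper: final 2D position of a mesh vertex with x-coordinate x
local function placeVertex(x, vWidth, px, py, nx, ny)
    return px + x * vWidth * nx, py + x * vWidth * ny
end

-- A circle of radius 2 about the origin, evaluated at the start of the arc
local px, py, nx, ny = arcPointAndNormal(0, {x = 0, y = 0}, {x = 2, y = 0}, {x = 0, y = 2}, 0, math.pi / 2)
print(placeVertex(0.5, 4, px, py, nx, ny))
```

For a circle with the usual axes, the rotated tangent points away from the centre, so positive x-coordinates land on the outside of the curve.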

Convert this to a vec4 since that’s what the shader needs to produce at the end.

    highp vec4 bzpos = vec4(bpos.x,bpos.y,0.,1.);

The caps need special handling. This ensures that they poke out a bit from the ends of the arcs in the right directions.

    bzpos.xy += (ecapsw*max(position.y-1.,0.)
                +scapsw*min(position.y,0.))*vec2(-bdir.y,bdir.x);
    highp float s = clamp(position.y, 
            scapsw*position.y,1.+ecapsw*(position.y-1.));
    vTexCoord = vec2(texCoord.x,s);

This blends the colours.

    vColour = t*mcolour + scolour;
    //Multiply the vertex position by our combined transform
    gl_Position = modelViewProjection * bzpos;
}
]],[[
//
// A basic fragment shader
//

Now the fragment shader. We need to know the blur and cap type.

uniform highp float blur;
uniform highp float cap;

All the varyings we want from the vertex shader.

varying highp vec2 vTexCoord;
varying highp float vWidth;
varying lowp vec4 vColour;
varying highp float vCore;

void main()
{

Start with the main colour (the blending from start to finish is already taken care of).

    lowp vec4 col = vColour;

This tells us where the edge should be.

    highp float edge = blur/(vWidth+blur);

The alpha is modified to take into account the edge and the cap type.

    col.a = mix( 0., col.a, 
            (2.-cap)*smoothstep( 0., edge, 
                min(vTexCoord.x,1. - vTexCoord.x) )
            * smoothstep( 0., edge, 
                min(1.5-vTexCoord.y, .5+vTexCoord.y) ) 
            + (cap - 1.)*smoothstep( 0., edge,
             .5-length(vTexCoord - vec2(.5,vCore)))
                );

    gl_FragColor = col;
}
]])
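To get a feel for the edge blur in the fragment shader, the across-the-width factor can be mirrored in plain Lua. This is only a sketch of the first smoothstep term, ignoring the caps; `edgeAlpha` is a hypothetical name, and the sketch assumes blur > 0 so that the edge width is not zero.

```lua
-- CPU-side mirror of GLSL smoothstep(edge0, edge1, x)
local function smoothstep(edge0, edge1, x)
    local t = math.min(math.max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
end

-- Hypothetical helper: alpha factor across the width of the arc body,
-- i.e. smoothstep(0., edge, min(x, 1. - x)) with edge = blur/(vWidth + blur).
-- x is the texture coordinate across the curve, in [0,1]. Assumes blur > 0.
local function edgeAlpha(x, vWidth, blur)
    local edge = blur / (vWidth + blur)
    return smoothstep(0, edge, math.min(x, 1 - x))
end

-- Fully opaque in the middle, fading to transparent at either side
print(edgeAlpha(0.5, 8, 2), edgeAlpha(0.1, 8, 2), edgeAlpha(0, 8, 2))
```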

Back to the Lua code. As said in the shader discussion, the arc is a mesh filling in the rectangle from (-.5,0) to (.5,1). We use lots of rectangles to make the arc smooth.

    for n=1,nsteps do
        m:addRect(0,(n-.5)/nsteps,1,1/nsteps)
    end
    m:addRect(0,1.25,1,.5)
    m:addRect(0,-.25,1,.5)
    return m
end
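To see how the rectangles tile the parameter range, here is a small plain-Lua sketch that computes the y-extent of each one. It relies on `addRect` taking the rectangle's centre, which is what the (-.5,0) to (.5,1) description implies; `arcRectBounds` is a hypothetical helper, not part of the original code.

```lua
-- Hypothetical helper: y-extent of each rectangle added by __makeArc,
-- assuming addRect(x, y, w, h) places the rectangle's centre at (x, y)
local function arcRectBounds(nsteps)
    local bounds = {}
    for n = 1, nsteps do
        local cy, h = (n - 0.5) / nsteps, 1 / nsteps
        bounds[#bounds + 1] = { lo = cy - h / 2, hi = cy + h / 2 }
    end
    bounds[#bounds + 1] = { lo = 1.25 - 0.25, hi = 1.25 + 0.25 }   -- end cap
    bounds[#bounds + 1] = { lo = -0.25 - 0.25, hi = -0.25 + 0.25 } -- start cap
    return bounds
end

for i, b in ipairs(arcRectBounds(4)) do
    print(i, b.lo, b.hi)
end
```

The body rectangles cover [0,1] without gaps, and the two extra rectangles cover [1,1.5] and [-.5,0] for the caps.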

There’s a lot of flexibility in there which could be removed to speed it up a bit. One thing it doesn’t do is fill in the arc: it is definitely an arc and not a segment of a circle.

bee · May 11, 2014, 8:28am

@andrew_stacey: Thank you so much for the explanation. I’ll play around with your code. Yes, I knew it draws an arc. That’s why I asked for a guide on how to make it draw a sector or a segment. I still don’t have a clear understanding of how to fill an area using a shader.

Jmv38 · May 11, 2014, 8:58am

@bee the problem is that when you want to fill the area, you’ll be back to your fps problem…

Coder · May 11, 2014, 10:16am

If you don’t need to change the angle of your arc you could just draw it once with setContext()

bee · May 11, 2014, 11:09am

@jmv38: Of course. But I think it should be possible using OpenGL (as Codea uses it as its backend). Some games are able to draw thousands of meshes at acceptable speed, even on the iPad 1.

@coder: If I wanted to draw a static arc, it wouldn’t be this difficult.

Jmv38 · May 11, 2014, 12:22pm

@bee the problem is not the number of sprites but the total number of screen pixels that have to be redrawn over and over again when you superpose many very large circles, as you want to do (if I’ve understood you correctly). Unless you use some additional tricks to skip drawing the hidden parts of the circles, you’ve already exhausted the iPad GPU, and you won’t succeed in being faster with ‘brute force’. As you’ve said, Codea’s mesh uses OpenGL, very directly I assume, so there is not much to hope for from a direct drawing approach. I say all this just because this is my current understanding of OpenGL, but I am not an expert, so I may be wrong.
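A back-of-the-envelope version of this argument, as a plain-Lua sketch: the function name and the sample numbers (40 shapes, a 1024 x 768 screen, full coverage) are assumptions chosen for illustration, not measurements from this thread.

```lua
-- Hypothetical estimate: fragments shaded per frame when nShapes shapes
-- each cover a fraction `coverage` (0..1) of a screenW x screenH screen
local function fragmentsPerFrame(nShapes, screenW, screenH, coverage)
    return nShapes * coverage * screenW * screenH
end

local perFrame = fragmentsPerFrame(40, 1024, 768, 1.0)
-- Fragments per frame, and per second at a target of 60 frames per second
print(perFrame, perFrame * 60)
```

The count grows with the area each shape covers rather than with its vertex count, which is the distinction being drawn here.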

bee · May 11, 2014, 12:44pm

Then how are those OpenGL-based games able to draw hundreds, even thousands, of meshes so fast? Maybe we should summon the experts, @simeon and @john, to join our discussion and give us some enlightenment.

Jmv38 · May 11, 2014, 12:59pm

Can you tell me which games draw hundreds of full-screen-size meshes simultaneously on the screen? I am really puzzled. (I have drawn meshes with 60,000 vertices on the iPad 1 with very good speed, but in practice each pixel of the screen would be covered by only 2 triangles at most. Your case is very different, since the circles overlap and each covers a large fraction of the screen…).

Andrew_Stacey · May 11, 2014, 5:11pm

@bee What I meant was that my mesh/shader approach draws the arc and only the arc. It doesn’t cover the segment of the circle. The arc shader that comes with Codea, on the other hand, does draw the segment of the circle (though possibly with transparent pixels, which is one reason that it is so slow). On the other hand, it shouldn’t be difficult to adapt my shader to draw the whole segment. Basically, the left-hand side of the curve should go to the centre point. That ought to do it.

SkyTheCoder · May 11, 2014, 5:45pm

@bee Most OpenGL-based games are made in Objective-C and OpenGL. Codea is much slower, probably mainly because Lua is a very slow language in general. It’s not the standard, built-in programming language, and it has a bunch of other complications. Objective-C is native, so it is much faster.

(Correct me if I’m wrong.)

bee · May 11, 2014, 10:38pm

@jmv38: Infinity Blade? EA Games’ games? CMIIW.

@andrew_stacey: I can understand how to make a segment from an arc. What I don’t quite understand is how to fill it, especially at an acceptable speed.

@skythecoder: Yes, I thought about that as well. I also read that trigonometric functions are very expensive. The suggestion is to use a lookup table instead of calculating them at runtime.
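For reference, a lookup table of the kind mentioned could look like the following plain-Lua sketch. The table size and the name `fastSin` are assumptions, and whether this is actually faster than `math.sin` in Codea has not been tested here; it also does not apply to trigonometry done inside a shader.

```lua
-- Precompute N samples of sine over one full turn (N is an arbitrary choice)
local N = 1024
local TWO_PI = 2 * math.pi
local sinTable = {}
for i = 0, N - 1 do
    sinTable[i] = math.sin(i * TWO_PI / N)
end

-- Hypothetical helper: nearest-sample lookup, accurate to roughly pi/N
local function fastSin(a)
    local i = math.floor((a % TWO_PI) / TWO_PI * N + 0.5) % N
    return sinTable[i]
end

print(fastSin(1.0), math.sin(1.0))
```

The trade-off is accuracy against table size: a nearest-sample lookup is off by at most about half a table step in angle.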

Andrew_Stacey · May 12, 2014, 2:47am

Hmm, it would be interesting to compare the arc code above (with sines and cosines) with a bezier approximation (since that only involves polynomials). I have code for both so I’ll give it a go and report back.
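For readers unfamiliar with the idea, here is a minimal plain-Lua sketch of one common cubic Bézier approximation to a unit-circle arc, using the control-point distance k = 4/3 * tan(angle/4). This is not the code referred to above; the function names are hypothetical, and plain tables stand in for `vec2`.

```lua
-- Control points of a cubic Bezier approximating the unit-circle arc
-- from angle a0 to angle a1 (best kept to a quarter turn or less)
local function arcToBezier(a0, a1)
    local k = 4 / 3 * math.tan((a1 - a0) / 4)
    local c0, s0 = math.cos(a0), math.sin(a0)
    local c1, s1 = math.cos(a1), math.sin(a1)
    return {
        { x = c0,          y = s0 },
        { x = c0 - k * s0, y = s0 + k * c0 },  -- start point plus k * tangent
        { x = c1 + k * s1, y = s1 - k * c1 },  -- end point minus k * tangent
        { x = c1,          y = s1 },
    }
end

-- Evaluate the cubic at parameter t using only polynomials in t
local function bezierPoint(p, t)
    local u = 1 - t
    local b0, b1, b2, b3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    return b0 * p[1].x + b1 * p[2].x + b2 * p[3].x + b3 * p[4].x,
           b0 * p[1].y + b1 * p[2].y + b2 * p[3].y + b3 * p[4].y
end

local p = arcToBezier(0, math.pi / 2)
print(bezierPoint(p, 0.5))
```

The trigonometry is needed only once, to set up the four control points; evaluating points along the curve afterwards involves only multiplications and additions.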

With regard to filling the segment, that’s what I was talking about. Simply send one edge of the rectangle to the centre and you’re done.
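One way to picture this, as a plain-Lua sketch for a circular arc: the mesh x-coordinate runs from -.5 to .5, so the sketch sends x = -.5 to the centre and x = .5 to the curve. Which edge counts as the "left-hand side" is an assumption here, and `sectorVertex` is a hypothetical helper rather than a change to the actual shader.

```lua
-- Hypothetical helper: position of a mesh vertex when the strip is stretched
-- from the centre (x = -0.5) out to the arc itself (x = 0.5).
-- t in [0,1] is the parameter along the arc; centre is a table {x=, y=}.
local function sectorVertex(x, t, centre, radius, startAngle, deltaAngle)
    local a = startAngle + t * deltaAngle
    local f = x + 0.5   -- 0 at the centre, 1 on the arc
    return centre.x + f * radius * math.cos(a),
           centre.y + f * radius * math.sin(a)
end

print(sectorVertex(-0.5, 0.3, {x = 5, y = 7}, 10, 0, 1))
print(sectorVertex(0.5, 0, {x = 5, y = 7}, 10, 0, 1))
```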

Jmv38 · May 12, 2014, 3:26am

@bee I have added a taper to smooth the circle edges for you. The visual result is no longer pixelated (but a bit fuzzy, of course!). Let me know how you like that one. Here is the code.


--# Main
-- Arc
-- on ipad air:
-- normal: #arcs: 40 => fps: 19
-- sprite: #arcs: 90 => fps: 56



-- Use this function to perform your initial setup
function setup()
    fps = 60
    parameter.watch("math.floor(fps)")
    parameter.watch("#arcs")
    parameter.action("print",status)
    arcs = {}
    img = image(WIDTH/4,HEIGHT/4)
end
function status()
    print("#arcs: "..tostring(#arcs) .." => fps: "..tostring(math.floor(fps)))
end
-- This function gets called once every frame
function draw()
    k = 0.01
    fps = fps*(1-k) + k/DeltaTime
    -- Render every arc into the quarter-size image, then stretch it to the screen

    setContext(img)
    background(255, 255, 255, 255)
    scale(1/4)
    for i, arc in ipairs(arcs) do
        arc:draw()
    end
    setContext()
    resetMatrix()
    background(255, 255, 255, 255)
    spriteMode(CORNER)
    sprite(img,0,0,WIDTH,HEIGHT)
end

function touched(touch)
    if touch.state == ENDED then
        table.insert(arcs, Arc(touch.x, touch.y,WIDTH))
        arcs[#arcs]:start()
    end
end


--# shader
arcShader = {
vprog = [[
//
// Vertex shader: Arc
//

uniform mat4 modelViewProjection;

attribute vec4 position;
attribute vec2 texCoord;

varying highp vec2 vTexCoord;

void main() {
    vTexCoord = texCoord;
    gl_Position = modelViewProjection * position;
}

]],
fprog = [[
//
// Fragment shader: Arc
//

precision highp float;

uniform float size;
uniform float a1;
uniform float a2;
uniform float pixelSize;
uniform vec4 color;
const float da = 0.01;
varying vec2 vTexCoord;

void main() {
    vec4 col = vec4(0.0);
    vec2 r = vTexCoord - vec2(0.5);
    float d = length(r);
    float taper = 1.0;
    if (d > size && d < 0.5) {
        taper = 1.0 - smoothstep(0.5-pixelSize*4.0, 0.5, d);
        taper = taper * smoothstep(size, size +pixelSize*4.0,d);
        float a = atan(r.y, r.x);
        if (a2 > a1) {
            if (a > a1 && a < a2) {
                col = color;
            }
        } else {
            if (a > a1 || a < a2) {
                col = color;
                if (a > a1) {
                    taper = taper * smoothstep(a1, a1+0.01, a);
                } else {
                    taper = taper * (1.0 - smoothstep(a2-0.01, a2, a));
                }
            }
        }
    }
    gl_FragColor = col * vec4(1.0,1.0,1.0,taper);
}

]]
}




--# Timer
Arc = class()

function Arc:init(x, y, s, thick, col, t)
    self.x = x or WIDTH / 2
    self.y = y or HEIGHT / 2
    self.size = s or WIDTH / 10
    self.thick = thick or WIDTH / 10
    self.time = t or 10
    local ran = function() return math.random(255) end
    self.color = col or color(ran(), ran(), ran(), 255)

    self.paused = false
    self.amnt = 0

    self.tMesh = mesh()
    self.tMesh.vertices = triangulate({vec2(-self.size / 2, -self.size / 2),
                        vec2(-self.size / 2, self.size / 2),
                        vec2(self.size / 2, self.size / 2),
                        vec2(self.size / 2, -self.size / 2)})
--    self.tMesh.shader = shader("Patterns:Arc")
    self.tMesh.shader = shader(arcShader.vprog, arcShader.fprog)
    
    self.tMesh.shader.a1 = math.pi
    self.tMesh.shader.a2 = math.pi
    self.tMesh.shader.pixelSize = 1/s
    self.tMesh.shader.size = .1
    self.tMesh.shader.color = self.color
    self.tMesh.texCoords = triangulate({vec2(0,0),vec2(0,1),vec2(1,1),vec2(1,0)})
end

function Arc:start()
    self.amnt = 0
    if self.timing == nil then
        self.timing = tween(self.time, self, { amnt = 1 }, tween.easing.linear)
    end
end

function Arc:pause()
    if self.timing ~= nil then
        tween.stop(self.timing)
    end

    self.paused = true
end

function Arc:resume()
    if self.timing ~= nil then
        tween.play(self.timing)
    end

    self.paused = false
end

function Arc:stop()
    if self.timing ~= nil then
        tween.stop(self.timing)
        self.timing = nil
    end
end

function Arc:restart()
    self:stop()
    self:start()
end

function Arc:draw()
    -- Update timer
    -- self.tMesh.shader.color = vec4(1 * self.amnt, 1 - (1 * self.amnt), 0, 1)
    self.tMesh.shader.a2 = -self.amnt * (math.pi * 2) + math.pi

    -- Draw timer
    pushMatrix()

    translate(self.x, self.y)

    rotate(270.1)

    self.tMesh:draw()

    popMatrix()
end

Jmv38 · May 12, 2014, 3:30am

Jmv38 · May 12, 2014, 3:33am

There are still a couple of defects on the ‘almost horizontal’ or ‘almost vertical’ edges. They could be removed with a larger angle deviation than 0.01 (or a better formula, because the angle deviation does not correspond to a fixed distance).
[edit]
The program below shows the weak region more clearly: actually, it is only for a1=pi that there is a problem. Just avoid this value.
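The angular part of the taper can be tried in plain Lua. This sketch mirrors the branching used in the fragment shader of the program below; `angularTaper` is a hypothetical name, and the comment about a1 = pi is one possible reading of the problem, based on atan returning angles in [-pi, pi].

```lua
-- CPU-side mirror of GLSL smoothstep(edge0, edge1, x)
local function smoothstep(edge0, edge1, x)
    local t = math.min(math.max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
end

-- Hypothetical helper: angular alpha factor for a fragment at angle a
-- (as returned by atan, so a lies in [-pi, pi]), with da the softening width
local function angularTaper(a, a1, a2, da)
    if a2 > a1 then
        if a > a1 and a < a2 then
            return smoothstep(a1, a1 + da, a) * (1 - smoothstep(a2 - da, a2, a))
        end
        return 0
    end
    -- the arc wraps around the +pi / -pi seam
    if a > a1 then return smoothstep(a1, a1 + da, a) end
    if a < a2 then return 1 - smoothstep(a2 - da, a2, a) end
    return 0
end

-- With a1 = pi, no angle satisfies a > a1, so the edge at the seam is never
-- softened: a fragment just past -pi is already fully opaque.
print(angularTaper(-math.pi + 0.001, math.pi, 0, 0.02))
print(angularTaper(-0.01, math.pi, 0, 0.02))
```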

Jmv38 · May 12, 2014, 3:48am


--# Main
-- Arc
-- on ipad air:
-- normal: #arcs: 40 => fps: 19
-- sprite: #arcs: 90 => fps: 56



-- Use this function to perform your initial setup
function setup()
    fps = 60
    parameter.watch("math.floor(fps)")
    parameter.watch("#arcs")
    parameter.action("print",status)
    arcs = {}
    img = image(WIDTH/4,HEIGHT/4)
end
function status()
    print("#arcs: "..tostring(#arcs) .." => fps: "..tostring(math.floor(fps)))
end
-- This function gets called once every frame
function draw()
    k = 0.01
    fps = fps*(1-k) + k/DeltaTime
    -- Render every arc into the quarter-size image, then stretch it to the screen

    setContext(img)
    background(255, 255, 255, 255)
    scale(1/4)
    for i, arc in ipairs(arcs) do
        arc:draw()
    end
    setContext()
    resetMatrix()
    background(255, 255, 255, 255)
    spriteMode(CORNER)
    sprite(img,0,0,WIDTH,HEIGHT)
end

function touched(touch)
    if touch.state == ENDED then
        table.insert(arcs, Arc(touch.x, touch.y,WIDTH))
        arcs[#arcs]:start()
    end
end


--# shader
arcShader = {
vprog = [[
//
// Vertex shader: Arc
//

uniform mat4 modelViewProjection;

attribute vec4 position;
attribute vec2 texCoord;

varying highp vec2 vTexCoord;

void main() {
    vTexCoord = texCoord;
    gl_Position = modelViewProjection * position;
}

]],
fprog = [[
//
// Fragment shader: Arc
//

precision highp float;

uniform float size;
uniform float a1;
uniform float a2;
uniform float pixelSize;
uniform vec4 color;
const float da = 0.02;
varying vec2 vTexCoord;

void main() {
    vec4 col = vec4(0.0);
    vec2 r = vTexCoord - vec2(0.5);
    float d = length(r);
    float taper = 1.0;
    if (d > size && d < 0.5) {
        taper = 1.0 - smoothstep(0.5-pixelSize*4.0, 0.5, d);
        taper = taper * smoothstep(size, size +pixelSize*4.0,d);
        float a = atan(r.y, r.x);
        if (a2 > a1) {
            if (a > a1 && a < a2) {
                col = color;
                taper = taper * smoothstep(a1, a1+da, a);
                taper = taper * (1.0 - smoothstep(a2-da, a2, a));
            }
        } else {
            if (a > a1 || a < a2) {
                col = color;
                if (a > a1) {
                    taper = taper * smoothstep(a1, a1+da, a);
                } else {
                    taper = taper * (1.0 - smoothstep(a2-da, a2, a));
                   // taper = 0.0;
                }
            }
        }
    }
    gl_FragColor = col * vec4(1.0,1.0,1.0,taper);
}

]]
}




--# Timer
Arc = class()

function Arc:init(x, y, s, thick, col, t)
    self.x = x or WIDTH / 2
    self.y = y or HEIGHT / 2
    self.size = s or WIDTH / 10
    self.thick = thick or WIDTH / 10
    self.time = t or 10
    local ran = function() return math.random(255) end
    self.color = col or color(ran(), ran(), ran(), 255)

    self.paused = false
    self.amnt = 0

    self.tMesh = mesh()
    self.tMesh.vertices = triangulate({vec2(-self.size / 2, -self.size / 2),
                        vec2(-self.size / 2, self.size / 2),
                        vec2(self.size / 2, self.size / 2),
                        vec2(self.size / 2, -self.size / 2)})
--    self.tMesh.shader = shader("Patterns:Arc")
    self.tMesh.shader = shader(arcShader.vprog, arcShader.fprog)
    
    self.tMesh.shader.a1 = math.pi
    self.tMesh.shader.a2 = math.pi
    self.tMesh.shader.pixelSize = 1/s
    self.tMesh.shader.size = .1
    self.tMesh.shader.color = self.color
    self.tMesh.texCoords = triangulate({vec2(0,0),vec2(0,1),vec2(1,1),vec2(1,0)})
end

function Arc:start()
    self.amnt = 0
    if self.timing == nil then
        self.timing = tween(self.time, self, { amnt = 1 }, tween.easing.linear)
    end
end

function Arc:pause()
    if self.timing ~= nil then
        tween.stop(self.timing)
    end

    self.paused = true
end

function Arc:resume()
    if self.timing ~= nil then
        tween.play(self.timing)
    end

    self.paused = false
end

function Arc:stop()
    if self.timing ~= nil then
        tween.stop(self.timing)
        self.timing = nil
    end
end

function Arc:restart()
    self:stop()
    self:start()
end

function Arc:draw()
    -- Update timer
    -- self.tMesh.shader.color = vec4(1 * self.amnt, 1 - (1 * self.amnt), 0, 1)
    self.tMesh.shader.a2 = -self.amnt * (math.pi * 2) + math.pi

    -- Draw timer
    pushMatrix()

    translate(self.x, self.y)

 --   rotate(270.1)

    self.tMesh:draw()

    popMatrix()
end

JakAttak · May 12, 2014, 9:22am

@Jmv38, that picture looks very nice, but on my iPad Mini (no retina display) it is still pixelated

Jmv38 · May 12, 2014, 10:24am

This is weird… Can you replace the line

 self.tMesh.shader.pixelSize = 1/s

with

 self.tMesh.shader.pixelSize = 1/s * 2

to double the border width? Or try x3, x4, etc. It should look smooth with higher values.
If nothing changes, does that mean the smoothstep function is not supported on the iPad mini?
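To see what pixelSize does to the border, here is a plain-Lua sketch of the radial part of the taper from the fragment shader. `radialTaper` is a hypothetical name; d is the distance from the centre in texture coordinates, and the sample pixelSize of 1/1024 is an arbitrary choice.

```lua
-- CPU-side mirror of GLSL smoothstep(edge0, edge1, x)
local function smoothstep(edge0, edge1, x)
    local t = math.min(math.max((x - edge0) / (edge1 - edge0), 0), 1)
    return t * t * (3 - 2 * t)
end

-- Hypothetical helper: radial alpha factor for a ring between radius `size`
-- and radius 0.5, with a soft border 4 * pixelSize wide on each side
local function radialTaper(d, size, pixelSize)
    if not (d > size and d < 0.5) then return 0 end
    local outer = 1 - smoothstep(0.5 - pixelSize * 4, 0.5, d)
    local inner = smoothstep(size, size + pixelSize * 4, d)
    return outer * inner
end

local px = 1 / 1024
-- Inside the ring, halfway through the outer border, and outside the ring
print(radialTaper(0.3, 0.1, px), radialTaper(0.5 - 2 * px, 0.1, px), radialTaper(0.6, 0.1, px))
```

Doubling pixelSize doubles the width of both soft borders, which is why larger values should look smoother but also blurrier.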

bee · May 12, 2014, 7:27pm

@jmv38: I can confirm that your code still produces a pixelated arc on my iPad 1, as it’s not retina either. Increasing pixelSize makes the circle blurrier, at the cost of edge sharpness. And the radius line is still pixelated; pixelSize doesn’t help there.

I think this blurriness and pixelation is caused by the image scaling. That’s why I have hesitated to use this technique from the beginning, unless someone can show me a technique to minimize it.

bee · May 12, 2014, 7:47pm

Btw, I’m still waiting for the polynomial approach from @andrew_stacey. As it doesn’t use expensive trigonometric functions, I hope it will be fast and still look beautiful. *fingers crossed*