Faster Line Drawing

Hi guys, can the vertex shader be used to provide faster line drawing than the standard line()?

I’ve written a ray tracing program which generates thousands of short lines. I need to be able to draw these lines, but this is pretty slow in standard Codea/Lua. I’ve done some benchmarking, and on my v3 iPad Codea seems to draw just over 5000 lines/sec, independent of line length.

Can anyone suggest ways of speeding this up?

Thanks.

Use meshes: make a thin rectangle, either with mesh:addRect or from your own vertices, and rotate it so that it spans the two points.

Hello @SPM1963. As @Luatee suggests, I think you can get a 20-fold speed increase by using a mesh. For example:

```
-- WARNING: Produces flashing output

function setup()
    strokeWidth(2.5)
    fill(255, 255, 0)
    fontSize(64)
    parameter.integer("n", 1, 5000, 5000, changed)
end

function draw()
    background(0)
    local m = mesh()
    -- Cache functions
    local random = math.random
    local atan2 = math.atan2
    local sqrt = math.sqrt
    for i = 1, n do
        local x1, y1 = random(WIDTH), random(HEIGHT)
        local x2, y2 = random(WIDTH), random(HEIGHT)
        local dx = x2 - x1
        local dy = y2 - y1
        local d = sqrt(dx * dx + dy * dy)
        local a = atan2(dy, dx)
        m:addRect(x1 + dx / 2, y1 + dy / 2, d, 2.5, a)
    end
    m:draw()
    c = c + 1
    local s = tostring(n)..": "..
        string.format("%d", c / (ElapsedTime - t))
    text(s, WIDTH/2, HEIGHT/2)
end

function changed()
    t = ElapsedTime
    c = 0
end

```


compared to:

```
-- WARNING: Produces flashing output

function setup()
    strokeWidth(2.5)
    fill(255, 255, 0)
    fontSize(64)
    parameter.integer("n", 1, 5000, 225, changed)
end

function draw()
    background(0)
    local random = math.random -- cache function
    for i = 1, n do
        local x1, y1 = random(WIDTH), random(HEIGHT)
        local x2, y2 = random(WIDTH), random(HEIGHT)
        line(x1, y1, x2, y2)
    end
    c = c + 1
    local s = tostring(n)..": "..
        string.format("%d", c / (ElapsedTime - t))
    text(s, WIDTH/2, HEIGHT/2)
end

function changed()
    t = ElapsedTime
    c = 0
end

```

@mpilgrem, your first code example has a memory leak, because of local m = mesh() inside the draw() function.
In principle the mesh should be garbage collected, but in my experience it is not.

Thanks guys, I hadn’t realised that a mesh could consist of independent polygons. If you take out the calculations and time just the drawing part, then the actual speed-up is much more than 20x, around 300x in fact!
I still wish we could have a mesh of lines though. Drawing one line should be even quicker than four triangles - two to make a rectangle and then two more to make a ‘+’ cross-section. Since I’m working in 3D space, a simple ‘-’ cross-section would disappear when viewed from in-plane.
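
For anyone wanting to try the ‘+’ cross-section idea, here is a minimal sketch in plain Lua of how the four triangles for one 3D segment might be generated. The function name `crossLineVertices` and the use of plain `{x, y, z}` tables are assumptions made for illustration; they are not part of Codea's API. The segment is assumed to have non-zero length.

```lua
-- Small vector helpers on plain {x, y, z} tables (hypothetical, for illustration).
local function add(a, b) return {a[1] + b[1], a[2] + b[2], a[3] + b[3]} end
local function sub(a, b) return {a[1] - b[1], a[2] - b[2], a[3] - b[3]} end
local function scale(a, s) return {a[1] * s, a[2] * s, a[3] * s} end
local function cross(a, b)
    return {a[2] * b[3] - a[3] * b[2],
            a[3] * b[1] - a[1] * b[3],
            a[1] * b[2] - a[2] * b[1]}
end
local function normalize(a)
    local len = math.sqrt(a[1] * a[1] + a[2] * a[2] + a[3] * a[3])
    return scale(a, 1 / len) -- assumes len > 0
end

-- Returns 12 vertices (4 triangles): two thin quads from p1 to p2,
-- perpendicular to each other, forming a '+' when viewed end-on.
function crossLineVertices(p1, p2, width)
    local dir = normalize(sub(p2, p1))
    -- Pick a helper axis that is not (nearly) parallel to the segment.
    local helper = math.abs(dir[3]) < 0.9 and {0, 0, 1} or {1, 0, 0}
    local u = normalize(cross(dir, helper)) -- first side direction
    local v = cross(dir, u)                 -- second side, perpendicular to u
    local verts = {}
    for _, side in ipairs({u, v}) do
        local off = scale(side, width / 2)
        local a, b = sub(p1, off), add(p1, off)
        local c, d = add(p2, off), sub(p2, off)
        -- Two triangles per quad: (a, b, c) and (a, c, d).
        for _, p in ipairs({a, b, c, a, c, d}) do
            verts[#verts + 1] = p
        end
    end
    return verts
end
```

In Codea the resulting points would presumably be converted to `vec3` values and collected for all segments into one table assigned to the mesh's vertices, so that the whole set is still drawn with a single `m:draw()`.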

Thank you for the warning @se24vad. While I read up on garbage collection in Lua 5.1, here is a (faster) alternative:

```
-- WARNING: Produces flashing output

function setup()
    strokeWidth(2.5)
    fill(255, 255, 0)
    fontSize(64)
    m = mesh()
    parameter.integer("n", 1, 5000, 5000, changed)
end

function draw()
    background(0)
    m:clear()
    -- Cache functions
    local random = math.random
    local atan2 = math.atan2
    local sqrt = math.sqrt
    for i = 1, n do
        local x1, y1 = random(WIDTH), random(HEIGHT)
        local x2, y2 = random(WIDTH), random(HEIGHT)
        local dx = x2 - x1
        local dy = y2 - y1
        local d = sqrt(dx * dx + dy * dy)
        local a = atan2(dy, dx)
        m:addRect(x1 + dx / 2, y1 + dy / 2, d, 2.5, a)
    end
    m:draw()
    c = c + 1
    local s = tostring(n)..": "..
        string.format("%d", c / (ElapsedTime - t))
    text(s, WIDTH/2, HEIGHT/2)
end

function changed()
    t = ElapsedTime
    c = 0
    m:resize(n * 6) -- 6 vertices per line (rectangle) 
end

```

A little knowledge is a dangerous thing, but my understanding of garbage collection and meshes is now as follows (thanks largely to the source code of the Codea Runtime Library):

Lua 5.1’s garbage collection makes use of the field `totalbytes` of Lua’s `global_State` (source file `lgc.c`). When a new userdata value of size `s` is created, the Lua API calls `luaM_malloc` behind the scenes to allocate `s` plus the size of the `Udata` type (`lapi.c` and `lstring.c`):

```
LUA_API void *lua_newuserdata (lua_State *L, size_t size) {
  Udata *u;
  ...
  u = luaS_newudata(L, size, getcurrenv(L));
  ...
}

Udata *luaS_newudata (lua_State *L, size_t s, Table *e) {
  Udata *u;
  ...
  u = cast(Udata *, luaM_malloc(L, s + sizeof(Udata)));
  ...
}

```


`luaM_malloc` is actually a variety of `luaM_realloc_` (`lmem.h`):

```
#define luaM_malloc(L,t)	luaM_realloc_(L, NULL, 0, (t))

```


and `luaM_realloc_` increases `totalbytes` by the size of the memory requested (`osize` being `0`) (`lmem.c`):

```
void *luaM_realloc_ (lua_State *L, void *block, size_t osize, size_t nsize) {
  ...
  g->totalbytes = (g->totalbytes - osize) + nsize;
  ...
}

```


In the case of a new 'mesh' userdata value, behind the scenes, a new userdata value is created by the Codea API of size `MESH_SIZE`, which is the size of the `mesh_type` type (`mesh.m`):

```
#define MESH_SIZE     sizeof(mesh_type)

```


`mesh_type` is a small `struct` (`mesh.h`) compared with the data associated with a mesh value. The bulky data of the mesh is allocated by the `initBuffer` function, which calls C's `malloc` directly, so the allocation does not increase Lua's `totalbytes` (`mesh.m`):

```
static mesh_type *Pnew(lua_State *L)
{
    mesh_type *meshData = lua_newuserdata(L, MESH_SIZE);
    initBuffer(&meshData->vertices, 3);
    initBuffer(&meshData->colors, 4);
    initBuffer(&meshData->texCoords, 2);
	...
}

static void initBuffer(float_buffer* buffer, size_t elementSize)
{
    buffer->capacity = 3000;
    ...
    buffer->buffer = malloc(buffer->capacity * buffer->elementSize * sizeof(GLfloat));
    ...
}

```
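
To put rough numbers on this, here is a small back-of-the-envelope sketch. It counts only the three buffers shown in `Pnew` above and assumes `sizeof(GLfloat)` is 4 bytes; any buffers hidden behind the `...` and any spare capacity after growth are not counted, so treat the results as lower bounds. The function name `meshBufferBytes` is made up for illustration.

```lua
-- Assumption: sizeof(GLfloat) is 4 bytes (typical for OpenGL ES).
local BYTES_PER_FLOAT = 4
-- Floats per vertex across the three buffers shown in Pnew:
-- vertices (3) + colors (4) + texCoords (2).
local FLOATS_PER_VERTEX = 3 + 4 + 2

-- Lower bound on the malloc'd bytes for buffers holding `vertexCount` vertices.
function meshBufferBytes(vertexCount)
    return vertexCount * FLOATS_PER_VERTEX * BYTES_PER_FLOAT
end

print(meshBufferBytes(3000))   -- initial capacity of a new mesh: 108000
print(meshBufferBytes(30000))  -- 5000 rectangles x 6 vertices: 1080000
```

So even an empty mesh reserves about 108 KB that Lua's accounting never sees, and a mesh holding 5000 rectangles needs roughly 1 MB or more.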


Similarly, re-sizing a mesh's buffers as the mesh grows larger than its initial capacity makes use of C's `realloc`:

```
static void resizeBuffer(float_buffer* buffer, int newLength)
{    
        ...
        buffer->buffer = realloc(buffer->buffer,
            buffer->capacity * buffer->elementSize * sizeof(GLfloat));
        ...
}

```


The overall result is that `local myMesh = mesh()` can use far more memory than Lua's garbage collector tracks through `totalbytes`. As a consequence, you may need to keep track of (or at least be aware of) memory used outside the collector's accounting, and code accordingly.

My first example above did not take that into account, and, after creating mesh values with 30,000 vertices at a rate of about 20 times a second, soon falls over.
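
Reusing a single mesh, as in the alternative above, is the simplest fix. If meshes really must be created every frame, one possible workaround is to force a full collection at regular intervals rather than waiting for `totalbytes` to trigger one. The sketch below is only that: the name `maybeCollect` and the interval of 30 frames are hypothetical, and it assumes (without my having confirmed it) that collecting a mesh userdata also releases its `malloc`'d buffers.

```lua
-- Hypothetical interval, in frames; tune by experiment.
local COLLECT_EVERY = 30
local framesSinceCollect = 0

-- Call once per frame, e.g. at the end of draw().
-- Returns true on the frames where a full collection was forced.
function maybeCollect()
    framesSinceCollect = framesSinceCollect + 1
    if framesSinceCollect >= COLLECT_EVERY then
        collectgarbage("collect") -- standard Lua: run a full GC cycle
        framesSinceCollect = 0
        return true
    end
    return false
end
```

A full collection is not free, so a shorter interval trades frame rate for a lower memory peak.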