Fastest 3D angle/axis rotation

daveedvdv · February 15, 2014, 9:56am

BTW, the JavaScript vs. Lua comparisons are a bit bizarre because for JavaScript they use JIT compilers (like V8), whereas for Lua they use the interpreter (LuaJIT makes for a better comparison, I think).

Andrew_Stacey · February 17, 2014, 3:54pm

@daveedvdv I’ve only just seen your comment about the __add method for vec4s (I don’t normally check the activity page, and I don’t get notified about it).

The intention is to allow addition of a quaternion and a number, promoting the number to the corresponding real quaternion. The normal addition of vec4s should still be there and corresponds to addition of quaternions. So providing that I coded it correctly (which might not have happened), there should be no change to existing programs in that the sum of two vec4s is still as it was before. There is just additional functionality which can be used or not.

daveedvdv · February 17, 2014, 4:26pm

@Andrew_Stacey: Sorry about the misplaced message.

I think I was worried about independent additions that make v+s mean v+I*s where I is an identity transformation.

Since then I ran into those performance surprises while benchmarking the v.x * a + v.y * b + v.z * c expression above. So, for myself, I plan to stay away from modifying entries of metatables for intrinsics.

It would be really nice if we could ask Lua (Codea) for a copy of an intrinsic type with a copy of its metatable (i.e., not the original metatable) or even if we could install a fresh metatable from within Lua like we can with tables. We’d then have a way to create types like quaternion from vec4 without affecting the vec4 interface. I think it’s actually pretty easy to implement if you have the source to Lua and its additions. I might add that to the issue tracker some day, but I think I have exceeded my goodwill requests already

daveedvdv · February 17, 2014, 8:35pm

Hmmm, regarding:

It would be really nice if we could ask Lua (Codea) for a copy of an intrinsic type with a copy of its metatable (i.e., not the original metatable) or even if we could install a fresh metatable from within Lua like we can with tables.

Apparently the latter can be done with debug.setmetatable(…). I might try something with that this week…

Ignatz · February 17, 2014, 10:35pm

@daveedvdv - I guess you’ve seen the wiki link to the official Lua docs, with a chapter on metatables

http://www.lua.org/pil/contents.html

Andrew_Stacey · February 18, 2014, 3:28am

@daveedvdv So you want to create a “virtual userdata” built on top of vec4 but which leaves vec4 as it is. So if I write q = quaternion(a,b,c,d) then q is a vec4 but with the extra methods available to quaternions, whilst if I write v = vec4(a,b,c,d) then v is a vec4 without those extra methods.

I think that the difficulty is that all vec4s share the same metatable, so changes to one affect everything. You’d have to divorce the quaternionic vec4 from its metatable and assign a new one. Although debug.setmetatable allows you to assign metatables to things other than tables, it still seems to suffer from the problem of uniqueness of metatables.

Ignatz · February 18, 2014, 3:33am

@Andrew_Stacey - why does a quaternion need to be a vec4? Can’t it just be a table containing four values?

Then you can do what you like with it, including labelling the items w,x,y,z in quaternion style, instead of the vec4 style of x,y,z,w.

Andrew_Stacey · February 18, 2014, 4:09am

@Ignatz In my original version, a quaternion was a table (well, an instance of a class). I decided to make it a vec4 because so many of the functions of the quaternion were simply “Do this on the underlying vec4s”.

The other reason is that while one might want to separate quaternions from vec4s, you can’t separate vec3s from vec3s. So if I want to allow a quaternion to act on a vec3, I have to mess with the vec3 metatable. Once I’m doing that then I may as well mess with the vec4 metatable directly.

The only issue that I can think of with making vec4s in to quaternions is that there’s a bit of extra checking when applying the functions. So when doing v + u then it checks to see if either v or u is a number before doing the addition. But I suspect that if one is at this point in optimisation then one is at the point of replacing inline operations by explicit functions, and then one can use the original functions for vec4s without the checking.

Let me underline that. I cannot think of a single operation on vec4s where the quaternion result is different to the vec4 result. So making vec4s in to quaternions only extends their functionality and doesn’t change any results. Other than that, it’s purely a question of speed.

daveedvdv · February 18, 2014, 10:07am

@Andrew_Stacey:

You’d have to divorce the quaternionic vec4 from its metatable and assign a new one.

Exactly.

Although debug.setmetatable allows you to assign metatables to things other than tables, it still seems to suffer from the problem of uniqueness of metatables.

I’m not sure what you mean by that, but here is what I’m planning to try next time I get some Codea time (sketch; no actual code per se):

    quaternion_meta = copy_of_table(debug.getmetatable(vec4())
    quaternion_archetype = vec4(1, 0, 0, 0)
    debug.setmetatable(quaternion_archetype, quaternion_meta)

    function quaternion(x, y, z, w)
      local q = vec4(x, y, z, w)
      debug.setmetatable(q, quaternion_meta)
    end

    ...

Quaternion creation would be a bit slower than vec4 creation (though initialization via the archetype might be feasible in many cases), but at least vec4 will remain unaffected.

If that turns out to be impractical, I’ll adapt your library to drop support for operator notation and just live with ordinary function call interfaces.

daveedvdv · February 18, 2014, 10:11am

@Ignatz:

why does a quaternion need to be a vec4? Can’t it just be a table containing four values?

While that’s possible, it would result in a great loss of speed and increase in storage costs.

daveedvdv · February 18, 2014, 10:20am

@Andrew_Stacey:

The only issue that I can think of with making vec4s in to quaternions is that there’s a bit of extra checking when applying the functions. So when doing v + u then it checks to see if either v or u is a number before doing the addition.

It appears that that dynamic checking is actually quite significant in overall vec3/vec4 basic operation costs. It’s especially surprising because code that doesn’t deal with the quaternionic character of vec4s (or even just deals with vec3s) ends up paying a substantial price.

But I suspect that if one is at this point in optimisation then one is at the point of replacing inline operations by explicit functions, and then one can use the original functions for vec4s without the checking.

That’s a lot of work though Including tracking down every significant use of vec3/vec4 operator notation in libraries.

Let me underline that. I cannot think of a single operation on vec4s where the quaternion result is different to the vec4 result.

OTOH, it’s not clear to me that for example v4*w4 should result in a Hamilton product by default (e.g., written without knowledge of a default quaternionic character). My preference would be that it actually be an element-wise operation (i.e., that it would result in (v4.x*w4.x, v4.y*w4.y, ...).

daveedvdv · February 18, 2014, 11:01am

Using debug.setmetatable as I suggested above doesn’t work because methods like vec4:len appear to check explicitly that the user data passed in has the original metatable :-(. (Makes sense to avoid memory corruption, but still unfortunate in some way…)

Andrew_Stacey · February 19, 2014, 3:29am

@daveedvdv I also tried using the (undocumented) newproxy command (search for it if you want to know). This seemed to run in to the same problem.

Back to the multiplication, I’m trying to work out when one would want to component-wise multiply 4-vectors and I can’t think of a situation where one would. In fact, I’ve yet to come up with a reason for using 4-vectors at all except as quaternions. What’s your use-case for them?

daveedvdv · February 19, 2014, 8:39am

I’ve used them to represent diagonal matrices. Also just as an efficient way to perform many multiplications.

daveedvdv · February 19, 2014, 8:42am

@Andrew_Stacey: Thanks for the the newproxy pointer!

Andrew_Stacey · February 21, 2014, 9:52am

@daveedvdv I don’t see how they can be efficient for performing many multiplications, unless by “efficient” you mean “concise”. Since there is no native v * u method for vec4s, you’d have to write your own which would expand them out, do the multiplications, then repack them in a new vec4. That seems more inefficient than just doing the multiplications.

With regard to diagonal matrices, then again I want to know what you are using vec4s to represent. If a vec4 is a diagonal matrix then it is acting on vec4s and those presumably represent points somewhere. But what points? The natural dimension of objects is 3, so I’d expect to be seeing vec3s not vec4s. (And if you look at my code, then vec3 * vec3 does do coordinate-wise multiplication.) The face that vec4s are used for points internally does not mean that you should try to multiply one by a diagonal matrix. The effect of multiplying by (a,b,c,d) is exactly the same as multiplying the 3-vector by (a/d,b/d,c/d) since we’re using homogeneous coordinates.

So neither of your uses is, as yet, convincing.

daveedvdv · February 21, 2014, 6:03pm

@Andrew_Stacey: I was speaking hypothetically. (In fact, I think I filed an enhancement request for having the element-wise operators last week.) vec2/vec3/vec4 are built-in data structures, and therefore IMO should not be assumed to be interpreted one particular way, even if one of those interpretations seems more obvious at first.

That said, I don’t really want to convince you. I happen to think it’s not a great idea (cfr. the surprise performance loss with vec3 re-dispatch), but programming style is ultimately a matter of taste.

Jmv38 · August 23, 2014, 11:59am

i copied the code of daveedvdv in 1 single post (instead of 2) for my showcase

-- MeshTest

function applyMatrixRowsToVec3(v, r1, r2, r3)
-- Let M be a matrix with rows r1, r2, and r3. Return M*v.
    return vec3(v:dot(r1), v:dot(r2), v:dot(r3))
end


function rotationVectors(u, angle)
-- Returns three 3D vectors corresponding to the rows of a matrix
-- implementing a rotation of the given angle around the given axis.
    local m = matrix()
    m = m:rotate(angle, u.x, u.y, u.z)
    return  vec3(m[1], m[2], m[3]),
            vec3(m[5], m[6], m[7]),
            vec3(m[9], m[10], m[11])
end

function rotateVec3(v, u, angle)
--  Return vector v rotated angle degrees around vector u.
    return applyMatrixRowsToVec3(v, rotationVectors(u, angle))
end


function sweepHollowCylinderMesh(basePoint, axis, radius, aN, cN)
-- This function creates a mesh of triangles that form a cylinder (without
-- tops). The mesh' vertices and normals are set, but not its colors.
-- This returns the mesh and its number of vertices.
-- basePoint (vec3): One end of the cylinder's axis
-- axis (vec3): The length and direction of the axis
-- radius (vec3 or number): Either the vector swept around the axis to
--                          describe the cylinder, or just a numerical radius
-- aN, cN: The number of steps in the axial and circular sweeps, respectively.

    -- Create the mesh and pre-size its position and normal buffers.
    local m = mesh()
    local bufLen = 6*cN*aN
    local posBuf = m:buffer("position")
    posBuf:resize(bufLen)
    local normBuf = m:buffer("normal")
    normBuf:resize(bufLen)
    -- If radius is given as a length, turn it into a vector orthogonal
    -- to the axis.
    if type(radius) == "number" then
        local r = radius
        local ax, ay, az = axis.x, axis.y, axis.z
        -- Beware of catastrophic cancelation
        if az >= math.abs(ax) and az >= math.abs(ay) then
            radius = vec3(az, az, -ax-ay)
        elseif ay >= math.abs(ax) and ay >= math.abs(az) then
            radius = vec3(ay, -ax-az, ay)
        elseif ay >= math.abs(ax) and ay >= math.abs(az) then
            radius = vec3(-ay-az, ax, ax)
        end
        radius = r*radius:normalize()
    end
    -- Compute the incremental rotation vector and the incremental
    -- axial sliding vector for sweeping the mesh.
    local rot1, rot2, rot3 = rotationVectors(axis, 360/cN)
    axis = axis/aN
    -- Sweep the mesh with axial sliding in the inner loop since that
    -- is more efficient than the rotational sweep.
    local i = 1 -- Index into the buffers
    local r1, r2, b1, b2
    r1 = radius
    for c = 1, cN do
        -- Rotate the radius vector, but ensure that the last one is
        -- exactly the one we started with.
        if c == cN then
            r2 = radius
        else
            r2 = applyMatrixRowsToVec3(r1, rot1, rot2, rot3)
        end
        b1 = basePoint
        for a = 1, aN do
            b2 = b1 + axis
            -- Create two triangles:
            local p2 = b1+r2
            local p3 = b2+r1
            posBuf[i] = b1+r1
            posBuf[i+1] = p2
            posBuf[i+2] = p3
            posBuf[i+3] = p2
            posBuf[i+4] = p3
            posBuf[i+5] = b2+r2
            normBuf[i] = r1
            normBuf[i+1] = r2
            normBuf[i+2] = r1
            normBuf[i+3] = r2
            normBuf[i+4] = r1
            normBuf[i+5] = r2
            i = i + 6
            b1 = b2
        end
        r1 = r2
    end
    return m, bufLen
end


function setup()
    basePt = vec3(0, -50, 0)
    axis = vec3(10, 100, 10)
    radius = 50
    hN = 2; cN = 24
    m, nVertices = sweepHollowCylinderMesh(basePt, axis, radius, hN, cN)
    colorBuf = m:buffer("color")
    colorBuf:resize(nVertices)
    for n = 1, nVertices, 6 do
        local c = color(240+math.random(-15, 15))
        for i = n, n+5 do
            colorBuf[i] = c
        end
    end
    camPos = { angle = 0 }
    tween(12, camPos, { angle = 2*math.pi }, { loop = tween.loop.forever})
end


function draw()
    -- Clear screen to black.
    background(0, 0, 0, 255)
    pushMatrix()
    perspective(45, WIDTH/HEIGHT, 200, -200)
    camera(0, 400*math.cos(camPos.angle), 400*math.sin(camPos.angle),
           0, 0, 0,
           0, math.cos(camPos.angle+math.pi/2), math.sin(camPos.angle+math.pi/2))
    m:draw()
    popMatrix()
end