Compute Builder Particles - 200K sprite rendering example (compute shaders) [updated for beta 349]

https://youtu.be/7nS1PfWlvqo

Compute Builder Particles

Update - 8/8/22: New version, both should work with build 347

V1.0 Compute Builder Particles.zip (2.2 KB)

V2.0 Compute Builder Particles 2.zip (4.4 KB)

Compatible with Build 349
V3.0 Compute Builder Particles.zip (4.4 KB)

  • Multiple attractors
  • Speed-based hue shift
  • Velocity-based stretching
  • Bloom post-processing effect

Description

I’m working on making some compute shader examples and came up with some fairly minimal code to demonstrate this. The above screenshot has 200K tiny Codea logo sprites in it :slight_smile:

This example won’t run in the current 4.x beta build but will in the next build. It’s using the new shader.builder() library to simplify working with shaders in code

It’s using some noise library code for the curl noise, which I will clean up, simplify and document in time

That video is really impressive, does this require the new Apple metal systems or can we run it on the older pads?

It just needs to be able to run Codea 4.x, which does require Metal, but Metal has been around since 2014, so it will probably be fine. It might handle fewer particles, but my 2018 iPad Pro is pushing 200K sprites at 120fps with no issues

Wow, it’s impressive just how much iPads are capable of. Kind of reminds me of the last time I had ants in the house though :grimacing:

Exciting properties!

It seems that these syntax forms are quite special, quite different from the old OpenGL ES shaders, and there are new things to learn.

function setup()
    
    -- Create a reusable shader chunk for the instances buffer that looks like this in GLSL:
    -- struct Instance
    -- {
    --    vec4 position; // (x, y, vx, vy) position and velocity
    -- };
    instancesChunk = shader.chunk()
        :buffer{"Instance", "instances", access = "readwrite"}
            :vec4("position")
        :done()
    
    -- Create the instanced unlit sprite shader
    instancing = shader.builder()
    :unlit()
    :add(instancesChunk) -- make instances buffer available to shader
    :cull("none"):blend("additive")
    :texture{"sprite", value = asset.builtin.Cargo_Bot.Codea_Icon}
    :vertex
    [[
        // Append instance position :D
        material.worldPosition.xy += instances[gl_InstanceID].position.xy;
        //material.color = vec3(1,0,0);
    ]]
    :material
    [[
        //vec2 vel = instances[gl_InstanceID].position.zw;
        material.baseColor = texture(sprite, getUV0());
        //material.baseColor.rb += vel;
    ]]  
    :build()
end

can’t wait to test it out with my shader, have you tested running this with max brightness for at least 5 minutes straight?

in my tests on codea 3.x with my shader (which you have seen is very large and lots of conditional branches) i can get 400 on screen sprites at around 100x200 dimensions with alpha overlapping,

after 5 minutes with max brightness the ipad gets very hot and starts to throttle the cpu/gpu as well as reduce the max brightness point, these conditions are not ideal for running an app or game so im always making sure i’m actually running below “max capability”

I haven’t tested it to that degree, although it was getting a little warm after 2-3 minutes. I could try stress testing it with a similar load to what you have, although I kind of expect similar results if this is a GPU bound issue

There’s only so much heat the device can passively shed during heavy loads, and blending hundreds of large translucent sprites could be too much. In my case, while there are 200k blended quads on the screen, they are very small (3-4 pixels each) and therefore don’t add up to much work. Do you have any examples of similar game visuals you’ve seen running on an iOS device that do not experience this heat build-up? Codea isn’t very efficient due to Lua, so I also wonder about the CPU load…
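As a rough back-of-the-envelope check of the numbers in this exchange, here is a small sketch in plain Lua (no Codea API). It assumes about 4 pixels per particle quad and 100x200 pixels per large sprite, and it ignores shader cost per fragment, so treat the result as an order-of-magnitude estimate only. The function name blendedFragments is made up for this example.

```lua
-- Rough count of blended fragments per frame:
-- number of sprites times the pixels each one covers.
local function blendedFragments(spriteCount, pixelsPerSprite)
    return spriteCount * pixelsPerSprite
end

-- 200K particle quads at roughly 4 pixels each (upper end of "3-4 pixels")
local particles = blendedFragments(200000, 4)

-- 400 sprites at roughly 100x200 pixels each
local largeSprites = blendedFragments(400, 100 * 200)

print("particle fragments:", particles)
print("large sprite fragments:", largeSprites)
print("ratio:", largeSprites / particles)
```

Under these assumptions the 400 large sprites produce about ten times as many blended fragments as the 200K particles, before the much heavier per-fragment shader is even taken into account.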

Amazing effect! Compute shaders are so much fun! I made a (partially) 3D version based on @John's version.

https://youtu.be/YCE8Ybu63JM

4 problems:
  1) The entity.scale property of the entity does not work
  2) The instances have no shadows
  3) Some instances of the character model are cubes, not clones of the model
  4) After tapping the run button, some of the functions that run come from the previously compiled code, not the latest code, which is obvious when an error occurs. It’s like having a compiled executable stored in a cache that gets loaded preferentially at run time.

I have been learning about compute shaders these days, and I see that they can use read-write textures in addition to buffers. Is this supported in Codea 4?

How do I create a new RWTexture in Codea?

And how do I set a memory barrier in Codea?

Compute Builder Particles 3D.zip (12.7 KB)

so i tried my shader and it’s not working at first because i had declared instancing in the vertex shader and spirv? complains about it -


#extension GL_EXT_draw_instanced: enable

hmm ok i’ll just disable this and figure out how to use the instancing you’re using (which might be as simple as using gl_InstanceID instead of gl_InstanceIDEXT). ok but then i get an error that texture2D is not an overloaded function?

the more i’m looking at the example here the more i get intimidated by how to change it,

@John i know you’re particularly inclined to work on shaders, maybe you can help and convert to the new builder way?
shader.zip (9.1 KB)

@skar texture2D() is the sampler function of OpenGL ES 2.x, and in OpenGL ES 3.x and later versions, this function becomes texture().

GL_EXT_draw_instanced is the extension for using instancing in OpenGL ES 2.x. It doesn’t need to be specified in OpenGL ES 3.x or later versions; just use the built-in variable gl_InstanceID.

It seems that there are a lot of places where the OpenGL ES 2.x syntax needs to be changed. Alternatively, you can try declaring the OpenGL ES 2.x shading language version (GLSL ES 1.00) like this:

#version 100
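The renames mentioned in this reply can be sketched as a naive text substitution in plain Lua. The function name modernize and the sample snippet are made up for illustration, and a real port usually needs more changes than these three, so this is a starting point rather than a converter.

```lua
-- Naive ES 2.x -> ES 3.x rewrite covering only the three items discussed above.
local function modernize(src)
    -- Drop the instancing extension line; instancing is built in from ES 3.0.
    src = src:gsub("#extension%s+GL_EXT_draw_instanced[^\n]*\n?", "")
    -- The EXT-suffixed built-in becomes the core built-in.
    src = src:gsub("gl_InstanceIDEXT", "gl_InstanceID")
    -- texture2D() was replaced by the overloaded texture().
    src = src:gsub("texture2D%s*%(", "texture(")
    return src
end

-- Hypothetical ES 2.x style snippet
local old = [[
#extension GL_EXT_draw_instanced: enable
vec4 c = texture2D(sprite, uv);
int id = gl_InstanceIDEXT;
]]

print(modernize(old))
```

Plain substitution like this does not understand GLSL, so it is worth reading the result before compiling it.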

As @binaryblues said, you’ve got some mixed GLSL syntax from different versions of the language. I’ve got a shader cross-compiler that goes from GLSL to Metal, but it isn’t perfect and so complex shaders might run into issues here and there. It’s definitely not a drop-in replacement for the old shader system so your mileage may vary

I’ll try porting it to 4.x and see how it goes (although there are a lot of uniforms and I don’t know what all of them are for)

It’ll be a good stress test for the new shader system, although I would say 400 large/overlapping blended sprites with this enormous heavily branching uber shader is definitely going to stress the GPU (huge instruction count, including loops and dozens of branches). Even if you ran this shader in pure C++ directly ported to Metal, you would still potentially have issues. It may need some simplification / optimisation in the end

thanks for pointing out, i’ll give the different names a try, although i think i need version 310 or higher because i saw a warning about that, but i’ll double check

hmm yes, and as you asked, i don’t know of any other game or app that does hundreds of shaded sprites at once, but i always get confused about this because the ipad can run 3d games where they have shaders on all the texture surfaces of their 3d models. i assume that it operates differently because the gpu figures out how to transform the 3d into pixels and only runs on those pixels, whereas with 2d we are overlapping constantly, causing a pixel to redraw many times, is that correct?

the branches are all different effects and the idea is you can have multiple effects at once, it’s derived from a unity shader i bought and talked to the owner to get it to glsl, would having the unity version help you at all?

You’re right in that typically 3D games mostly render opaque geometry, and this is sorted front to back. The GPU is able to automatically z-reject fragments that won’t be visible due to occlusion. When you draw blended sprites, every sprite draws every fragment regardless of order
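To make the difference concrete, here is a small CPU-side model in plain Lua. It is not Codea API and not how a GPU is actually programmed; it only counts how many fragments would be shaded on a one-dimensional row of pixels. The sprite layout and the function names are invented for the example.

```lua
-- Each sprite covers pixels x0..x1 on a 1D scanline; a lower z is closer to the camera.
-- The list is already sorted front to back.
local sprites = {
    { x0 = 1, x1 = 8,  z = 1 },
    { x0 = 3, x1 = 10, z = 2 },
    { x0 = 1, x1 = 10, z = 3 }, -- full-width background
}

-- Opaque sprites with a depth test: a fragment is shaded only if nothing
-- closer has already been written to that pixel.
local function shadedOpaque(list)
    local depth, shaded = {}, 0
    for _, s in ipairs(list) do
        for x = s.x0, s.x1 do
            if depth[x] == nil or s.z < depth[x] then
                depth[x] = s.z
                shaded = shaded + 1
            end
        end
    end
    return shaded
end

-- Blended sprites: every covered pixel is shaded for every sprite.
local function shadedBlended(list)
    local shaded = 0
    for _, s in ipairs(list) do
        shaded = shaded + (s.x1 - s.x0 + 1)
    end
    return shaded
end

print("opaque, depth tested:", shadedOpaque(sprites))
print("blended:", shadedBlended(sprites))
```

In this toy layout the background sprite contributes no shaded fragments at all in the opaque case, while in the blended case it is shaded across its full width.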

Most game engines will use #ifdef blocks to conditionally add/remove sections of code based on which features are currently active so that they do not incur unnecessary shader instructions. The downside is that each possible combination of features must be compiled separately, which is typically done by the engine during the build process. Codea 4.x supports options, which can take the place of boolean uniforms and statically compile features but at the cost of compiling them at runtime
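A quick sketch in plain Lua of why the number of compiled variants grows quickly with #ifdef features. The feature names are hypothetical and this is not how Codea's options are implemented; it only enumerates the #define preambles an engine would have to compile, one per on/off combination.

```lua
-- Hypothetical feature names, for illustration only.
local features = { "OUTLINE", "GLOW", "HUE_SHIFT" }

-- Build the #define preamble for one combination. Bit i of `mask`
-- (counting from 0) says whether features[i + 1] is switched on.
local function preamble(mask)
    local lines = {}
    for i, name in ipairs(features) do
        if math.floor(mask / 2 ^ (i - 1)) % 2 == 1 then
            lines[#lines + 1] = "#define " .. name
        end
    end
    return table.concat(lines, "\n")
end

-- One variant per on/off combination: 2^n of them for n features.
local variants = {}
for mask = 0, 2 ^ #features - 1 do
    variants[#variants + 1] = preamble(mask)
end

print(#variants .. " variants to compile")
```

Three features already give eight variants, and each extra feature doubles the count, which is why engines usually do this work at build time.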

I’ll still try to get the shader working and see just how expensive it is

i don’t know if this is possible but is there a way to track if a pixel has already been given a color and if it has an opaque alpha value?

my thinking is if sprites are fed to the shader from closest to the screen to furthest away, and the shader runs on sprite 1, can we effectively discard any future pixel work on the pixels that already have a solid color? so example codea circular logo in the center and a second one 25px higher, after the first logo ran through the shader can we say x,y pixels that the logo covers fully (255 alpha) should be skipped for logo 2

maybe it’s not possible to track each pixel?

@skar It is possible to check each pixel. Would you be doing this for a lot of pixels or just a small area? It can be done with normal code or with shader code, so the amount would determine which one you use.

i would want it to run for every pixel of every image that i put, so that could mean hundreds of images on the screen and with very different dimensions, we can imagine something as small as an environmental prop that is 10x10 or a huge backdrop environment asset that is 400x600. And in most cases i also have 1 screen sized background asset so that would be screen height x width.

but with this idea the background asset would never render all of its pixels since most would be covered by other game assets (environment, props, characters, etc)

i’m working on building a world with a day night cycle so every image on the screen will have to have a lighting shader, and i want spells and effects that cast lighting on everything around them, that’s why i come from the perspective of “every sprite should have a shader”

@skar Here’s a shader example I have that gets the color of a pixel. I draw 100 random colored circles on the screen and a smaller white circle in the center. Slide your finger around the screen to move the small circle. When it’s over a colored circle, the whole screen turns the color of that circle. I never learned shaders that well, but it seems to work.

viewer.mode=FULLSCREEN

function setup()
    x,y=WIDTH/2,HEIGHT/2
    
    img=image(WIDTH,HEIGHT)     
    setContext(img)
    background(0)    
    -- draw 100 random colored circles
    for z=1,100 do
        fill(math.random(255),math.random(255),math.random(255),255)
        ellipse(math.random(WIDTH),math.random(HEIGHT),20)
    end    
    setContext()
    
    m=mesh()
    m:addRect(WIDTH/2,HEIGHT/2,WIDTH,HEIGHT)
    m.texture=img   -- bind the circles image to the shader's "texture" sampler
    m.shader=shader(DS.vS,DS.fS)
    m.shader.xSize=WIDTH
    m.shader.ySize=HEIGHT
    m.shader.xs=0
    m.shader.ys=0
end

function draw() 
    background(0) 
    m:draw()
    fill(255)
    ellipse(x,y,10)   
end

function touched(t)
    if t.state==MOVING then
        x=x+t.deltaX
        y=y+t.deltaY
        m.shader.xs=x/m.shader.xSize
        m.shader.ys=y/m.shader.ySize
    end
end

DS = 
{   vS= 
    [[
        uniform mat4 modelViewProjection;
        attribute vec4 position; 
        attribute vec4 color; 
        attribute vec2 texCoord;
        varying lowp vec4 vColor; 
        varying highp vec2 vTexCoord;  
    
        void main() 
        {   vColor=color;
            vTexCoord = texCoord;
            gl_Position = modelViewProjection * position;
        }   
    ]],
    
    fS = 
    [[
        uniform lowp sampler2D texture;
        varying lowp vec4 vColor; 
        varying highp vec2 vTexCoord;
        uniform highp float xs;
        uniform highp float ys;
        void main() 
        {   
            // color of this fragment, sampled from the circles image
            lowp vec4 col = texture2D( texture, vTexCoord) * vColor;
    
            // get color of pixel at xs,ys
            lowp vec4 colr = texture2D(texture,vec2(xs,ys));
    
            // if white circle is over a colored circle, make screen that color
            if ( colr.r > 0. || colr.g > 0. || colr.b >0. )
               col=colr;
            gl_FragColor = col; 
        }    
    ]]
}

Blending is either enabled or disabled for each drawing operation, so you can’t do it on an individual basis for each pixel (this would be far slower). You can use alpha cutouts via the discard command in the shader to have opaque sprites with hard edges (no soft alpha edges)

@John Today I tried the new syntax. It looks short and clear, but I found an error: you cannot define 2 buffers with the same struct type:

    instancesChunk = shader.chunk()
    :buffer{"Instance", "instances", access = "readwrite"}
    :vec4("position")
    :done()
    
    instancesChunk1 = shader.chunk()
    :buffer{"Instance", "Position", access = "readwrite"}
    :vec4("position")
    :done()

Because for every chunk, Codea generates the code below:

struct Instance
{
    vec4 position;
};

Maybe we need to add a method for defining a struct type in the new syntax; otherwise we cannot use a new struct type.
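One way a shader generator could avoid emitting the same struct twice is to remember which struct names it has already written. The sketch below is plain Lua and purely illustrative: the table layout and the function emitStructs are hypothetical, and it says nothing about how Codea's builder actually works internally.

```lua
-- Hypothetical description of two buffers that share one struct type.
local buffers = {
    { struct = "Instance", name = "instances", fields = { "vec4 position" } },
    { struct = "Instance", name = "Position",  fields = { "vec4 position" } },
}

-- Emit each struct definition once, keyed by struct name.
local function emitStructs(list)
    local seen, out = {}, {}
    for _, b in ipairs(list) do
        if not seen[b.struct] then
            seen[b.struct] = true
            local body = {}
            for _, f in ipairs(b.fields) do
                body[#body + 1] = "    " .. f .. ";"
            end
            out[#out + 1] = "struct " .. b.struct .. "\n{\n"
                .. table.concat(body, "\n") .. "\n};"
        end
    end
    return table.concat(out, "\n\n")
end

print(emitStructs(buffers))
```

With both buffers declared, the struct Instance definition appears only once in the generated text.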