How to make applying effects to the images quick?


First, thanks for Codea. For someone like me who doesn’t own a Mac, this app is really helpful.

Second, I’m working on some code in Codea that applies effects to images (let’s say a grayscale effect, r=g=b=(r+g+b)/3).
The problem is that applying the above equation to every pixel of, for example, a 1024*768 image takes time, whereas in apps like Photoshop Touch or other image processing apps the effect is immediate.

Can you help me to reduce the time?


Note: English is not my mother tongue, so sorry for any errors.

Hey @None, I don’t know about grayscale, I don’t see any shortcut for that (apart from doing it once and saving the image, or loading it in setup), but a quicker version to tint an image (e.g. Sepia) is to use setContext(img) and draw a rect over the image with a certain fill.

This will be coming soon in the form of shaders with Codea 1.5. Stay tuned for more information.

thnx @Jordan, John, but I just used grayscale as an example. I actually want to apply a “comic effect” to the images (by reducing the 1,2,…,255 RGB levels to, for example, 10,20,30,…,250). So it is not about shade or tint.

@Jordan, actually I’ve already done what you proposed. I applied the effects to the images in setup, but since the aim of my code is to build up a personal comic book, I have many images, so it really takes time for the code to show its first “page”.

I really wonder how Photoshop Touch manages to apply those effects so quickly?
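The banding described here can be sketched in plain Lua as below. The function names `posterize` and `posterizePixel` and the `step` parameter are hypothetical, not taken from the poster's code; the sketch assumes channel values in 0..255 and snaps each one down to the nearest multiple of `step`.

```lua
-- Hypothetical helper: snap a 0..255 channel value down to a multiple of `step`.
local floor = math.floor

local function posterize(value, step)
    return floor(value / step) * step
end

-- Apply the same banding to all three colour channels of one pixel.
local function posterizePixel(r, g, b, step)
    return posterize(r, step), posterize(g, step), posterize(b, step)
end

print(posterizePixel(37, 128, 255, 10))
```

In Codea the three inputs would come from `image:get(x, y)` and the results would be written back with `image:set`.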


Hello @None. If you want to do a comic book then

1/ The number of images will probably be too big to load/keep them all on the iPad

2/ you probably want other people to be able to read the book?

So i would suggest that you

1/ precalculate the images once

2/ put them on your web site (some free addresses are available) or in your Dropbox (free) with Codea’s built-in tools

3/ your app should then read the strips one after another from your web address. This is very easy from Codea

The long wait will be for you, once, but then it will be ok.

By the way, how long does it take you to process one 1024x768 image? For me it is on the order of 1s on iPad 1, for some simple processing like what you describe above. If it’s longer for you, then it means there is something wrong in your code. Then you could post it so somebody can check whether there is some obvious improvement.

thnx Jmv38, this is part of my code which applies the comic effect.

(fr= number of frames in a page, for example a comic page with 4 frames in a page

im[i]= name of the original image for the ith frame

xc[i],yc[i],w[i],h[i]= x center,y center, width and height of ith frame

cimageConverti= page number)

function imageConvert(cimageConverti)
    for i=1,table.maxn(fr) do
        for iw=xc[i]-w[i]/2+1,xc[i]+w[i]/2 do
            for ih=yc[i]-h[i]/2+1,yc[i]+h[i]/2 do
                r,g,b,a = imgo:get(iw,ih)

And I guess 1024*768 takes about 2s for me. With 8 pages that adds up to about 20s, which I guess is long for a user to wait before the show starts.

And, you know, about your “2/ you probably want other people to be able to read the book?”: I actually want people to MAKE the comic book using their own camera pictures, so uploading pre-processed images is not a real option.

I know camera access is still not included in Codea, but I’m looking to the future!

2s is not bad, so there is not much room for improvement. If you let each image appear one after the other, then the page could start appearing after only 2s, which is acceptable. It’s a comic book, so draw() doesn’t need to be real time: if there is one update every 2s, it should be ok? You could calculate 1 image, make it appear gradually over 1/2 s (using tint(255,255,255,a) with a going from 0 to 255), then calculate the next one, make it appear gradually, etc… That would look cool! If you really want real-time user interaction while calculating a new image, then there is a solution, but it needs some work to understand and implement (I’ve done it in my planet simulation code), so it might not be useful yet.

thnx Jmv38, good hint about one update every 2s. i’ll try to implement that. if ur planet simulation code is available at this forum i will check it as well.


The code is available in the planet topic, last update. However it’s pretty complex. I’ve modified it to be a bit simpler, but this last one is not posted yet.

@Jmv38, sorry! I don’t know how I could be so stupid! Actually the update for 1024*768 takes 10s on iPad 2! I, stupidly(!), just felt that it was 2s and didn’t use a watch (mainly because I waited for several images to be updated and it seemed as if each took just about 2s). But timed with a watch it was about 10s. So do you think the code can be improved?


Then it’s a bit long. I would try two things: 1/ avoid unnecessary table indexing inside a loop. In the following line, replace all the indexing operations with local variables that you create before entering the loop


Should be:

local myimage = img[cimageConverti][i]
local dx0 = xc[i] - w[i]/2
local dy0 = (yc[i]-h[i]/2)

It takes a lot of time to run through the table to find the right index: if you do that inside the loop, it is a huge unnecessary cost. You could also avoid the systematic subtraction (iw-dx0) by setting the correct starting and ending values in the for statement. Another thing, probably minor: 2/ the function modf() returns 2 values, but you use only one. I would replace it with floor(), which does your posterize effect and is probably faster.
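A minimal sketch of both suggestions, written in plain Lua so it runs outside Codea. The `frames` table, its field names, `posterizeFrame` and the nested-table "image" are stand-ins invented for illustration; in Codea the inner read and write would go through `image:get` and `image:set` instead.

```lua
-- Stand-in data: one frame descriptor and a tiny "image" stored as rows of grey values.
local frames = { { xc = 3, yc = 2, w = 4, h = 2 } }
local pixels = {}
for y = 1, 4 do
    pixels[y] = {}
    for x = 1, 6 do pixels[y][x] = x * 40 + y end
end

local function posterizeFrame(frames, i, pixels, step)
    -- Look everything up once, before the loops, instead of once per pixel.
    local frame = frames[i]
    local floor = math.floor
    local x0 = frame.xc - frame.w / 2 + 1
    local x1 = frame.xc + frame.w / 2
    local y0 = frame.yc - frame.h / 2 + 1
    local y1 = frame.yc + frame.h / 2

    for y = y0, y1 do
        local row = pixels[y]          -- one table lookup per row, not per pixel
        for x = x0, x1 do
            row[x] = floor(row[x] / step) * step
        end
    end
end

posterizeFrame(frames, 1, pixels, 50)
```

The loop bounds are computed once, so no subtraction is needed per pixel, and `floor` is held in a local rather than looked up in `math` on every iteration.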

@ Jmv38, thank you. your comments are really helpful. i will implement all of them to check the result.

But can you help me with my main question: how do apps like Photoshop Touch manage to apply these effects almost immediately?

You know, I don’t possess a Mac and I am really new to coding for iOS, but since I think Xcode is the direct way to develop iOS apps, are apps like PST quick because they are coded directly in Xcode?

If so, will the speed increase if I run my code as a native app? (You know, I have recently downloaded an app called Pythonista which has Xcode export.)

thank you again for your patience with a newbie!

.@None scripting languages, in general, will be slower than native code for iterating over large 2D data structures. Both Pythonista and Codea will be slow in this regard.

Apps like Photoshop Touch likely utilise the GPU for certain image effects (either using Core Image, or their own system), as well as serious optimisations for CPU-driven effects.

Codea is getting GLSL shaders in the next version (1.5) which makes many image editing operations (such as, for example, inverting an image or converting it to black-and-white) capable of running at 60 frames-per-second on large images.

EDIT: Also, using 2D loops, that is:

for x = 1, width do
    for y = 1, height do
        -- Use x,y as normal
    end
end


Might be much slower than a 1D loop over the same data, e.g:

local x, y
local size = width * height
local flr = math.floor

for i = 1, size do
    x = flr( (i-1)%width ) + 1
    y = flr( (i-1)/width ) + 1

    -- Use x,y as normal
end


.@simeon whoa! That will be fantastic! And will it be the same for setContext(img) with respect to the speed? Will there still be a large performance drop between drawing to the display and drawing into an image?

.@Jmv38 setContext will always add some overhead. We’ve improved the speed in 1.5 so that it never actually writes out an image unless you attempt to access the image pixels — this was by far the slowest part of setContext, writing out the image data. So we now avoid that in the most common use cases (where you just want to render to image, and then render that image).

Shaders will work as part of mesh(). So any mesh with a shader applied will use that shader to transform its vertices and pixels as it is rendered to the screen. These transformations will happen in parallel on the GPU, which is as fast as you can get really.

@ Simeon, thank you. your hint about 1D was really great. i will try that. and image editing operations capable of running at 60 fps: WOW!

And you know, it is really encouraging to have access to a forum where you can get such great hints within just 24 hrs!

.@simeon thanks for your answer. I tried your 1D suggestion, but I can’t confirm it: the 1D loop takes much longer, at least with this program:

-- faster loop

-- Use this function to perform your initial setup
function setup()
    width,height = 4000,4000

    local i,j,dummy
    -- start the timer for the 2d loop
    local t0 = os.clock()
    for i = 1,height do
        for j=1,width do
            dummy = i+j
        end
    end
    print("2d loop : " .. os.clock() - t0 .." seconds")

    local x, y
    local size = width * height
    local flr = math.floor
    -- restart the timer for the 1d loop
    t0 = os.clock()
    for i=1,size do
        x = flr( (i-1)%width ) + 1
        y = flr( (i-1)/width ) + 1
        dummy = x+y
    end
    print("1d loop : " .. os.clock() - t0 .." seconds")
end

function draw()
end

i get:
2d loop : 3.52072 seconds
1d loop : 27.202 seconds
The extra calculations are really costly...

@ Jmv38. I guess if the 1D loop and using local data don’t speed things up significantly, I have to use another scheme based on your earlier comments:

My regular loop for the comic is like that:

  • 1: fade in a full screen normal image

  • 2: show some text on it (as if someone is typing)

  • 3: wait

  • 4: gradually crop the full screen image to the desired frame (for example 1/2 screen frame)

  • 5: fadeout normal image / fade in comic effected image with size of the frame

  • 6: go to 1 for the next image

My scheme is to calculate the comic-effect image in the background while 1-4 and 6 are running. The background calculation should be such that 1-4 and 6 look smooth, and by the time the code reaches 5 the calculation has finished.

I will work on that and on the hints you and Simeon suggested. But as I’m so busy I guess it will take me a week to do so!
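One possible way to spread such a background calculation over many frames is a Lua coroutine that processes a few rows and then yields. This is only a sketch under assumptions: the thread does not say how the work would be scheduled, and `makeWorker`, `rowsPerStep` and the nested-table image are hypothetical names. In Codea the resume call would sit inside `draw()`, next to the animation code for steps 1-4.

```lua
-- Build a coroutine that posterizes `pixels` a few rows at a time.
local function makeWorker(pixels, step, rowsPerStep)
    return coroutine.create(function()
        local floor = math.floor
        for y = 1, #pixels do
            local row = pixels[y]
            for x = 1, #row do
                row[x] = floor(row[x] / step) * step
            end
            -- Hand control back to the caller after every `rowsPerStep` rows.
            if y % rowsPerStep == 0 then coroutine.yield() end
        end
    end)
end

-- Stand-in image: 8 rows of 4 grey values.
local pixels = {}
for y = 1, 8 do
    pixels[y] = { 12, 99, 140, 255 }
end

local worker = makeWorker(pixels, 50, 2)
local steps = 0
-- Each pass of this loop plays the role of one draw() call.
while coroutine.status(worker) ~= "dead" do
    assert(coroutine.resume(worker))
    steps = steps + 1
end
print("finished in " .. steps .. " steps")
```

Choosing `rowsPerStep` is a trade-off: a small value keeps each frame short so the animation stays smooth, while a large value finishes the image in fewer frames.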

.@Jmv38 thanks for confirming that — I guess it doesn’t apply to this situation. I thought the for-loop overhead might be expensive, but it seems not.

I’ve just tried some more examples with jmv38’s code.

First, the good news: a single for-loop is faster than a double for-loop.

Next, the bad news: the difference is so insignificant that it is wiped out if you have to do anything to compensate for the use of a single loop instead of a double loop.

With identical code inside the loop I get the following figures:

  1. Double loop: 1.91815s
  2. Single loop: 1.91681s

(total iterations the same as jmv38: 16e6; I’m on an iPad2)

So that’s a gain of less than 0.1%

Now suppose I really want to get at the variables in the double loop. So something like the original dummy = i + j. (Note: there’s no need to make i and j local as that’s automatic inside the double loop, and dummy should be made local.) The double loop simply has dummy = i + j. The single loop has to compute the i and j. There are a variety of ways to do this. The code in jmv38’s example is very expensive. A cheaper way to do it is with local variables x and y which are initially set to 1 and 1 and an auxiliary (local) variable w set to width + 1; then in the loop (at the end) we do:

x = x + 1
if x == w then
    y = y + 1
    x = 1
end

The times are now:

  1. Double loop: 1.9252s
  2. Single loop: 4.13666s

The above might seem a little wasteful as we always do x = x + 1 and sometimes throw away the result. Surely an if ... then ... else ... end would be better!

if x == w then
    y = y + 1
    x = 1
else
    x = x + 1
end

(here w is set to width) gives 4.38278s! So we lose .2s by introducing an else branch into our conditional. In fact, removing the conditional altogether (whence just having the x = x + 1 statement - whence the loops are no longer functionally equivalent) drops the time to 2.4726s. So the conditional is adding over a second to the time whilst the addition operation is adding only half a second. We can narrow this down even further by simply putting in a conditional and nothing else:

if x == w then
end

(so no increment on x) gives 3.57404s.

I wondered about optimising in other ways. How about caching the values of i+j and simply using a look-up? The point of this would be that if you are going to do the same loop lots of times it might be quicker to save the results on the first loop and reuse them. Sadly, no. Storing i + j in a table and then putting dummy = t[i] in the loop does not help: I got 3.40796s for that (just for the look-up loop, that is).

Now there are lots of ways to compute x + y from i (where x = i%width + 1 and y = math.floor(i/width) + 1) using a variety of tests and operations. I’ve yet to find one that brings the total computation down to anywhere near comparable to the double loop.

However, this experimenting does say that it is worth considering alternative ways to do necessary computations inside the loops. For example, % is more expensive than / by a long way, but using math.floor even more so (even when localised).
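Comparisons like the ones above can be reproduced with a small timing helper. The sketch below is plain Lua; `timeIt` and the iteration count are hypothetical choices, and the absolute numbers depend heavily on the device and Lua version, so only the relative ordering on a given machine is meaningful.

```lua
-- Hypothetical helper: run `fn` once and return the CPU seconds it took.
local function timeIt(fn)
    local t0 = os.clock()
    fn()
    return os.clock() - t0
end

local n = 1000000       -- iteration count; an arbitrary choice for this sketch
local width = 1000
local floor = math.floor

local tDivide = timeIt(function()
    local dummy
    for i = 1, n do dummy = i / width end
end)

local tModulo = timeIt(function()
    local dummy
    for i = 1, n do dummy = i % width end
end)

local tFloor = timeIt(function()
    local dummy
    for i = 1, n do dummy = floor(i / width) end
end)

print("divide: " .. tDivide .. " s")
print("modulo: " .. tModulo .. " s")
print("floor : " .. tFloor .. " s")
```

Keeping the loop body identical apart from the one operation under test is what makes the three timings comparable.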