Image compression

Mark · January 24, 2012, 4:49pm

Just fiddling around, as this is an issue that has been tackled by far bigger brains than mine (including at least a couple of big brains in the Codea community). However, I thought I’d give it a whirl.

Here’s a first stab at a compressed format:

16,16,6,123,7F7E7FFF,ACA7A4FF,453740FF,FF0000FF,7E857EFF,000000FF,027001,028001...0FC002,106006

In this example, the first values of the image are encoded as decimal numbers – image width, image height, number of colors, and number of pixels. So this is a 16x16 image with 6 different colors and a total of 123 non-transparent pixels.

After that, the code switches to hexidecimal, if for no better reason than 255 encodes neatly as FF in hex. Immediately following number of pix are the colors for this image, each coded as an 8-character hex string. For example, FF0000FF works out to solid bright red. Following the color definitions are the pixels of the image, each of which is encoded as a six-character hex string. The first three characters of this string represent the pixel location, the second three represent the color. Why three hex values each? Well, Spreitely supports images up to 32x32 pixels, meaning that there are potentially 1,024 pixels and (less likely but possibly) just as many colors. That’s too many to encode on two hex values (max of 256) but it fits easily in three (max of 4,096). So this format will currently support images up to 64x64, even if every pixel has a unique color; though if every pixel DOES have a unique color, the compression will be minimal.

Even stripping all commas, encoding the image in the normal way would mean 16 characters (4 for location, 12 for color) this way, it takes 6 for each pixel, plus another 8 for each color. In the case of this image, that’s a savings of better than 60%.

There are a comple of easy things I can think of doing with this:

I can make the pixel encoding flexible, so that the number of hex characters dedicated to position and color varies based on the size of the image and pallete. In the example above, each pixel could have really be represented by only 3 hex characters, with the first two going to position and the last to color selection.
I could add simple run length encoding by using one character at the end of each group to represent the number of continuous characters of the same color. Not particularly valuable for something like a photo, but generally effective against icons and game art where the same colors cover good portions of the image.

I also thought about moving up to some greater base value than hex, but it doesn’t really seem worth it this side of 255, and that would mean the resulting string contained a lot of unprintable values or characters that could affect their use in the editor. In any case, here’s the current version of encode and decode. Send encode an image, get a string. Send decode a string, get an image.

function compressImage(img)
    -- create string form of the image
    local x, y, numclrs, numpix
    local c, r, g, b, a, i, found, tc, p, s
    numclrs = 0
    numpix = 0
    c = {}
    p = {}
    -- scan image for colors used
    for x = 1, img.width do
        for y = 1, img.height do
            r, g, b, a = img:get(x, y)
            if a > 0 then
                numpix = numpix + 1
                found = false
                for i = 1, numclrs do
                    if r == c[i].r and g == c[i].g and
                        b == c[i].b and a == c[i].a then 
                        found = true
                        p[numpix] = vec3(x, y, i)
                    end
                end
                if not found then
                    --add color
                    numclrs = numclrs + 1
                    c[numclrs] = color(r, g, b, a)
                    p[numpix] = vec3(x, y, numclrs)
                end
            end
        end
    end
    s = img.width..","..img.height..","
    ..numclrs..","..numpix..","
    for i = 1, numclrs do
        s = s..hex(c[i].r, 2)..hex(c[i].g, 2)
        ..hex(c[i].b, 2)..hex(c[i].a, 2)..","
    end
    for i = 1, numpix do
        s = s..hex(p[i].x * img.width + p[i].y, 3)..hex(p[i].z, 3)
        if i < numpix then
            s = s..","
        end
    end
    return s
end

function decompressImage(s)
    -- retrieve a compressed image
    local x, y, numclrs, numpix, r, g, b, a
    local w, i, c, p, img, clr, k
    c = {}
    p = {}
    count = 0
    for k in string.gmatch(s,"([^,]+)") do
        count = count + 1
        if count == 1 then
            w = k
        elseif count == 2 then
            img = image(w, k)
        elseif count == 3 then
            numclrs = k
        elseif count == 4 then
            numpix = k
        elseif count >= 5 and count < 5 + numclrs then
            -- unpack a color
            r = string.sub(k, 1, 2)
            r = tonumber(r, 16)
            g = string.sub(k, 3, 4)
            g = tonumber(g, 16)
            b = string.sub(k, 5, 6)
            b = tonumber(b, 16)
            a = string.sub(k, 7, 8)
            a = tonumber(a, 16)
            c[count - 4] = color(r, g, b, a)
        elseif count >= 5 + numclrs then
            --unpack pixel
            i = string.sub(k, 1, 3)
            i = tonumber(i, 16)
            x = math.floor(i / img.width)
            y = i - x * img.width --+ 1
            if y == 0 then
                x = x - 1
                y = img.height
            end
            clr = string.sub(k, 4, 6)
            clr = tonumber(clr, 16)
            img:set(x, y, c[clr])
        end
    end
    return img
end

function hex(dec, len)
    local b, k, out, i, d = 16, "0123456789ABCDEF", "", 0
    while dec > 0 do
        i = i + 1
        dec, d = math.floor(dec / b), math.mod(dec, b) + 1
        out = string.sub(k, d, d)..out
    end
    while string.len(out) < len do out = "0"..out end
    return out
end

ruilov · January 24, 2012, 7:33pm

a couple of suggestions:

define a background color. A transparent pixel will be of that color
since you list the colors, instead of using 3 hex digits to represent the color position, use less if the number of colors is low. So up to 16 colors, you can use only one. The number of digits used is implied by the number of colors
instead of using chars/hex, use ascii code. That way instead of 16 choices per byte, you get 256.

ruilov · January 24, 2012, 7:47pm

oh actually on bullet 2, just do:

color1: a bunch of positions that use color1

color2: a bunch of positions that use color2

Then you don’t have to index the color anymore. Could even take the idea further and say

color1:

x1: a bunch of y coordinates

x2: a bunch of y coordinates

…

I think database people call this normalizing the data or something like that. I don’t know, I’m not one of them.

Zoyt · January 24, 2012, 7:50pm

Haha. You two got me flustered. I do make myself go to sleep trying to think of different methods of encoding. I came of with a good one for 4 color numbers of 0 - 255, but I forgot it. Should’ve written it down. Shoulda, woulda, coulda.

Bortels · January 24, 2012, 10:16pm

I’m torn.

On the one hand - I so want to do this. Way back in the very early 90’s, I wrote a .GIF decoder, going to the local university library to look up an IEEE article on LZW compression. So - I’m there.

But - this is also a solved problem, very much. Codea (and the iPad) supports png graphics, and encoders/decoders for that, gzip, and base64 (so we can make it cut and pasteable) are readily available. I spent an afternoon working on simple RLE encoding - and it was an order of magnitude larger than the equivalent PNG file.

So I’m hoping TLL adds them - soon.

Mark · January 25, 2012, 2:09am

@Bortels Oh, absolutely. I realize the futility of my dabbing at the edges of a problem that’s been addressed by not just individual experts in mathematics and information theory, but whole teams of them working over years. I know that at the very best I’ll carve a crude chariot wheel while there are F1 cars zooming past.

It’s just an exercise–what’s the best I can think up to give me a graphics format that’s not just compact, but easily incorporated as text within code. Oh yeah, and simple enough for me to encode and decode without breaking too many neurons.

Ipad41001 · January 25, 2012, 6:42am

Unless you make it yourself, it can’t truly be yours.

http://www.journalpioneer.com/media/photos/unis/2011/10/07/photo_1872036_resize.jpg

Blanchot · January 25, 2012, 7:05am

   Unless you make it yourself, it can't truly be yours.

Oh how true!

Bortels · January 25, 2012, 4:06pm

Well - as an EXERCISE, I wholly support it!

I’d suggest googling RLE - Run Length Encoding. It’s pretty straightforward, and can reduce size tremendously when your images have large regions of single colors. It’s also a fundamental part of Windows .BMP files, so there’s a real -world tie-in.

But my real purpose was not so much to discourage as to beg TLL to get us png encode/decode, zlib, and base64 support. (I can do base64 in lua - but it’s way faster if it’s native)