Using accented characters

Hello there,

Long time no see (bloody university…) :slight_smile:

I’ve been putting this question aside for a while, but I need it addressed now: is there a way to make Codea handle accented characters? Every time I try drawing a string with accented characters, I get gibberish. When I try putting the string in a table, the characters are removed.


Lua processes strings one byte at a time, so characters encoded with byte values over 127, such as accented characters, can’t be used for things like variable names.

See discussion here:

If you just want to use them in text strings, Andrew_Stacey has written a library of functions to handle them, here (ask him if something isn’t clear)
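To illustrate the byte-wise point, here is a minimal sketch. It assumes the source is saved as UTF-8 (which is what makes é take two bytes), and `byteValues` is a made-up helper name, not something from Lua or Codea.

```lua
-- byteValues is a hypothetical helper, not part of Lua or Codea.
-- It returns the numeric value of every byte in a string.
function byteValues(s)
    return { string.byte(s, 1, -1) }
end

local plain = byteValues("e")     -- a single byte with a value below 128
local accented = byteValues("é")  -- two bytes in UTF-8, both above 127

-- Print the byte count followed by the byte values
print(#plain, table.concat(plain, " "))
print(#accented, table.concat(accented, " "))
```

The second line of output should show two bytes for é, which is why anything that counts or matches single bytes treats it as two separate things.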

In a W.I.P. encryption program I’m working on, I get a similar problem. I coded UTF-8 support into the encryption algorithm, but parameter.text() doesn’t seem to accept accented characters.

@Rodolphe What are you trying to do? Are you trying to print accented characters on the screen, or to use them as variable names? The former is easy; the latter is probably a bad idea.

(Oops, I didn’t notice I had an answer)

@Andrew_Stacey, I’m trying to print accented characters on screen. Cheers! :slight_smile:

@Rodolphe Can you post some code that isn’t working? When I use non-ASCII characters, they work just fine.

Something like this, for example:

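function setup()
    mots = {}
    t = "de l'hôpital au musée"

    l = string.len(t)
    i = 0
    j = 0

    -- Split the sentence into words (loop body assumed; it was lost from the post)
    for word in string.gmatch(t, "%a+") do
        i = i + 1
        mots[i] = word
    end

    -- List the words that were captured (loop body assumed)
    for i,v in pairs(mots) do
        print(i, v)
    end

    j = 0
    k = 0
    l = 0
end

function draw()
    k = k + 1
    l = l + 1

    -- Fall back to "-" once every word has been shown
    m = mots[j+1] or "-"
    background(36, 57, 90, 255)
    fill(255, 255, 255, 255)
    -- Every 10 frames, move on to the next word
    if k == 10 then
        k = 0
        if j < #mots then
            j = j + 1
        else
            m = "-"
        end
    end
    -- Drawing call assumed; the original line was lost from the post
    text(m, WIDTH/2, HEIGHT/2)
end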

The problem there is that Lua works byte-wise, so its patterns aren’t set up to deal with Unicode (or any non-ASCII) characters. %a matches single bytes that are letters, and that excludes accented characters because each one is encoded as two or more bytes, none of which counts as a letter.
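A minimal sketch of what that does to the sentence from the posted code, assuming UTF-8 source and Lua's default "C" locale. `splitWords` is a hypothetical helper name.

```lua
-- splitWords is a hypothetical helper that collects every match of a pattern.
function splitWords(s, pattern)
    local words = {}
    for word in string.gmatch(s, pattern) do
        words[#words + 1] = word
    end
    return words
end

-- With %a+ the bytes of ô and é are not letters, so they act as separators
-- and the accented words come back in pieces.
local words = splitWords("de l'hôpital au musée", "%a+")
print(table.concat(words, " | "))
```

The accented letters themselves never appear in the result, which matches the "characters are removed" symptom.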

One option is to use a UTF-8 library. I have one, in this gist, as the file utf8.lua. It lets you iterate over characters rather than bytes.
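To give an idea of what iterating over characters means, here is a standalone sketch. It is not the API of utf8.lua; `utf8Chars` is a made-up name, and it assumes valid UTF-8 input with no embedded NUL bytes.

```lua
-- utf8Chars is a hypothetical helper, not taken from the utf8.lua gist.
-- The pattern matches one lead byte (ASCII, or 194-244 for a multi-byte
-- sequence) followed by any continuation bytes (128-191).
function utf8Chars(s)
    local chars = {}
    for ch in string.gmatch(s, "[\1-\127\194-\244][\128-\191]*") do
        chars[#chars + 1] = ch
    end
    return chars
end

local chars = utf8Chars("musée")
print(#chars)                    -- number of characters
print(#"musée")                  -- number of bytes
print(table.concat(chars, " "))  -- one entry per character, é kept whole
```

Each table entry is a whole character, so é stays in one piece even though it occupies two bytes.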

For the specific code that you’ve posted, there is a simpler solution: change %a to %S and it works. %S matches any non-space byte, and that includes the bytes of the non-ASCII letters.
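A short sketch of that change, again with a hypothetical `splitWords` helper and assuming UTF-8 source. The second pattern is an extra suggestion, not something from the code above: it accepts ASCII letters plus any byte above 127, so the apostrophe still splits words.

```lua
-- splitWords is a hypothetical helper that collects every match of a pattern.
function splitWords(s, pattern)
    local words = {}
    for word in string.gmatch(s, pattern) do
        words[#words + 1] = word
    end
    return words
end

local t = "de l'hôpital au musée"

-- %S+ keeps every run of non-space bytes together, punctuation included
print(table.concat(splitWords(t, "%S+"), " | "))

-- Assumed alternative: ASCII letters plus all bytes from 128 to 255
print(table.concat(splitWords(t, "[%a\128-\255]+"), " | "))
```

With %S+ the punctuation stays attached to the words, which is fine for this sentence but worth keeping in mind for text with commas or full stops.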

@Andrew_Stacey Any way to get input from the user, using parameter.text(), with accented characters?

@SkyTheCoder Works for me. Have you tried it?

Yeah… Or maybe the error is from hard-coding in a string with an accented character? (In a table.) I can PM you with the project if you want.

@SkyTheCoder Please do and I’ll take a look.

(btw I sent you a message a week or so ago. Did you get a chance to look at it?)

@Andrew_Stacey I’ll send it.

And yes, I got the message, sorry, but I’ve been really busy with other stuff. I’m working on a reply, though.

@SkyTheCoder No rush - just checking you got it.