String permutations

I know this is a textbook problem — I’m just unhappy with what I’ve found of textbook answers.

Given a string, I’d like to return all the unique permutations. However, examples that I’ve found often produce duplicate strings, or repeat the use of letters. That’s an issue when dealing with a problem that already requires recursion to avoid massive iteration.

So, for example, given the string “abc”, I’d like to get back abc, acb, bac, bca, cab, cba. But not aaa or aab, etc.

Surely there’s a nice clean implementation that’s eluded my googling. Thanks.

What do you want returned for an input of aab? 3 permutations (aab aba baa) or 6 (aAb Aab abA Aba baA bAa)

The six set. Really, a character array permutation is a clearer statement.

@Mark Here’s a version.

function setup()
    str="abc"
    combination(str,1)
end

function combination(t,v)
    if v==#t then
        print(str)
        return
    end
    for i=v,#t do
        swap(v,i)      
        combination(t,v+1)
        swap(v,i)
    end
end

function swap(x,y)
    tab={}
    for z=1,#str do
        table.insert(tab,string.sub(str,z,z))        
    end
    tab[x],tab[y]=tab[y],tab[x]
    str=table.concat(tab)
end

What @dave1707 said

Here’s a better version to list the permutations for larger string values.

EDIT: I did the string abcdefghij and it showed 3,628,800 permutations. I didn’t scroll thru them to verify all were correct, I’ll leave that to someone else if they want to.
If I did the string abcdefghijklmnopqrstuvwxyz, there would be 403,291,461,126,605,635,584,000,000 permutations.

function setup()
    str="abcdef"
    perm={}
    combination(str,1)
    st=1
end

function combination(t,v)
    if v==#t then
        table.insert(perm,str)
        return
    end
    for i=v,#t do
        swap(v,i)      
        combination(t,v+1)
        swap(v,i)
    end
end

function swap(x,y)
    tab={}
    for z=1,#str do
        table.insert(tab,string.sub(str,z,z))        
    end
    tab[x],tab[y]=tab[y],tab[x]
    str=table.concat(tab)
end

function draw()
    background(0)
    fill(255)
    text("Slide finger to scroll list",200,HEIGHT/2)
    text("Permutations "..#perm,200,HEIGHT/2+50)
    en=math.min(st+HEIGHT//20,#perm)
    for z=st,en do
        text(perm[z],WIDTH/2,HEIGHT-(z-st+1)*20)        
    end
end

function touched(t)
    st=(st+t.deltaY)//1
    if st>#perm-10 then
        st=#perm-10
    end
    if st<1 then
        st=1
    end
end

Thanks, Dave. I dragged the old 178K word dictionary out of ScramWords and am spinning the results of your routine past it to find words-within-words. I was pretty happy with how that old program used the r,g,b channels to snug all those words into. 768x768 image. Now I’m wishing I’d used the a channel to encode grade level. Some of the words popping out are pretty obscure.

This is another good one for coroutines:

function permgen (a, n)
   if n == 0 then
      coroutine.yield(a)
   else
      for i=1,n do
	 
	 -- put i-th element as the last one
	 a[n], a[i] = a[i], a[n]
	 
	 -- generate all permutations of the other elements
	 permgen(a, n - 1)
	 
	 -- restore i-th element
	 a[n], a[i] = a[i], a[n]
	 
      end
   end
end

function perm (a)
   local n = table.getn(a)
   return coroutine.wrap(function () permgen(a, n) end)
end

str = "abcdef"
tbl = {}
while str ~= "" do
   table.insert(tbl,str:sub(1,1))
   str = str:sub(2)
end
for p in perm(tbl) do
   print(table.concat(p," "))
end

I looked up the ScramWords link and saw a post of mine there about a 1025x1025 pixel image that I created that holds 300,249 words.

The Scramwords dictionary is a pretty close match for the Scrabble dictionary, though I compiled it from unofficial sources. Scrabble has 187K words, Scramwords 178k. A big part of that difference comes from two additional lists added to Scrabble since I made the original image.

But honestly, I’m still just pleased that it worked out. Certainly a lot simpler toting around a modest image than a dictionary of strings.

@Mark Is this to play scrabble. If so, then you have to add values to the letters so you can find the words that create the most points.

@Mark If you’re searching for scrabble words, then this code will work better than word permutations. It finds and prints any word in a dictionary file that matches the scrabble letters from 1 character to how many letters you have. For testing, you can change the characters in the variable letters and you can change the words in the table dict to try different letters/words combinations. The table dict should be the dictionary table, one word per table entry. The upper/lower case of the letters must match the case of the dictionary or more code could be added to force the case to match. More code can be added to give a scrabble value to each word found.

function setup() 
    letters="tienvofxs" -- upper/lower case of letters must match case of dictionary
    
    st={}
    for z=1,#letters do
        table.insert(st,string.sub(letters,z,z))
    end
    table.sort(st)

    -- read a dictionary file and create a dict table of words
    dict={} 
    dict={"one","two","three","four","five","six","seven","eight","nine","ten"}
    
    foundTab={}
    for z=1,#dict do
        find(dict[z],st)
    end
    
    print("letters = "..letters)
    print("words that match some letters")
    print(table.concat(foundTab,"\
"))
end

function find(d,s)
    if #d>#s then
        return
    end
    local dt={}
    for z=1,#d do
        table.insert(dt,string.sub(d,z,z))
    end
    table.sort(dt)
    local pos=1
    for z=1,#dt do
        if pos>#s then
            return
        end
        for y=pos,#s do
            if dt[z]==s[y] then
                pos=y+1
                break                
            elseif dt[z]>s[y] then
                if y>=#s then
                    return
                end
            else
                return
            end
        end
    end
    table.insert(foundTab,d)
end

Here’s an updated version that shows the scrabble value of the words matching the letters.

function setup() 
    -- scrabble value of the letters a thru z
    valTab={1,3,3,2,1,4,2,4,1,8,5,1,3,1,1,3,10,1,1,1,1,4,4,8,4,10}
    
    local letters="tienvofxs" -- upper/lower case of letters must match case of dictionary

    local st={}
    for z=1,#letters do
        table.insert(st,string.sub(letters,z,z))
    end
    table.sort(st)

    -- read a dictionary file and create a dict table of words
    local dict={} 
    dict={"one","two","three","four","five","six","seven","eight","nine","ten"}

    foundTab={}
    for z=1,#dict do
        find(dict[z],st)
    end

    print("letters = "..letters)
    print("words that match some letters")
    print(table.concat(foundTab,"\
"))
end

function find(d,s)
    if #d>#s then
        return
    end
    local dt={}
    for z=1,#d do
        table.insert(dt,string.sub(d,z,z))
    end
    table.sort(dt)
    local pos=1
    for z=1,#dt do
        if pos>#s then
            return
        end
        for y=pos,#s do
            if dt[z]==s[y] then
                pos=y+1
                break                
            elseif dt[z]>s[y] then
                if y>=#s then
                    return
                end
            else
                return
            end
        end
    end
    table.insert(foundTab,d.." "..getValue(d))
end

function getValue(word)
    local tot=0
    for z=1,#word do
        tot=tot+valTab[string.byte(string.sub(word,z,z))-string.byte("a")+1]
    end  
    return tot
end

One minor thing: wouldn’t it be more efficient to create a list of sorted words at the start rather than sort them every time you invoke the find function?

@LoopSpace The dictionary file should already be sorted. I do one sort in setup to get the letters in order, then in find(), I sort the characters of each word. Having the letters and the characters of each word in order make it easy to do the compare. That way I can compare multiple characters that are the same without a problem.

Edit: Actually, the dictionary file doesn’t need to be sorted. But if you wanted to speed things up, then you could stop the dictionary search once you past the highest letter you have.

@dave1707 By “sorted words” I meant with each word rearranged into alphabetical order. The point is that the step “in find(), I sort the characters of each word” only needs to be done once, but you’re doing it every time.

@LoopSpace Are you saying that I should sort each word in the dictionary file so that the letters of each word are in alpha order. I could do that easy enough, but if that’s the case then when I find a match, I won’t know what the word is because the letters are in alpha order. I took my code above and used my 300,249 word dictionary file and it found 290 words matching the letters tienvofxs. It took 1.7 seconds which isn’t that long.

Thanks. I’d already cranked a version on my old ScramWords dictionary. BTW, I do something with my dictionary that’s likely heavy abuse of lua tables, but seems to work fine—I just use the words as indexes and fill the table with empty strings. That makes it very simple to determine if a word is present without iteration.

If this ends up in a larger program, 1.7s might end up being too long.

Keep the original dictionary then when a word is found in the alpha-sorted one, return the corresponding entry in the original one.

Even better would be to generate the corresponding pattern, something like a.?d.?n for and, and use string.find since then it’s working at the c-level so will be much faster yet.

To use hashing, as @Mark suggests, you’d be attacking the problem the other way around: generate all substrings of the tiles and look for those. Hash lookup is very fast, and there are only 127 non-empty alpha-ordered substrings of 7 tiles so that could well be fastest.

I did have another silly idea: convert each word to a number by assigning a prime to each letter and multiplying. Then a word can be made from the tiles if its number divides the tiles number.

@LoopSpace I was thinking about converting each word to a number also, but I wasn’t sure if that would work or not so I just went the other way. When I have nothing to do later today, I’ll see if that might work.