@LoopSpace I tried converting words and letters to numbers, but I didn’t see how that would work when I tried match anything. The next thing I tried was sorting the letters in each word and having a dictionary table of words with their sorted letters with each word. It took almost 5 seconds to read the dictionary file and create a table of words and their sorted letters. But on the bright side, instead of taking 1.7 seconds to find the 290 matching words, it took only .3 seconds. The thing is, I don’t think the 1.7 seconds to find matching words is that bad. You have to take into consideration that each time you pick new letters, you have to enter those letters in a input field, tap a button to find matching words, then look thru the list of matched words to determine what word you want to use.
@dave1707 To convert to numbers, you have to choose a prime for each letter and then multiply the numbers together. So cab = 5 x 2 x 3 = 30. Then t is a substring of s if s%t == 0.
Re speed: what about when the computer takes a turn? Then you don’t want the user waiting around doing nothing.
@LoopSpace I’ll play around with what you show above later, but here’s what I think is a good routine. You don’t need to do any sorting at all. You just create the dictionary as a table of words. Using the string.sub function and both the words and letters as strings, you just match a character in one string to a character in the other string. If there’s a match, you just replace the matched character with a ! so it’s not used again. Using this routine on my 300,000+ word dictionary file, it took only .4 seconds to find the 290 words that took 1.7 seconds with the other routine. So you create the dictionary file once and then change the letters with parameter.text and press parameter.find .
function setup()
parameter.text("letters","ivnetxovfsne")
parameter.action("find",find)
found={}
dictionary={"one","two","three","four","five","six","seven","eight","nine","ten"}
for a,b in pairs(dictionary) do
find(b,letters)
end
print("letters = "..letters)
print("matched words")
print(table.concat(found,"\
"))
end
function find(word,letter)
for z=1,#word do
c=string.sub(word,z,z)
letter,n=string.gsub(letter,c,"!",1)
if n==0 then
return
end
end
table.insert(found,word)
end
@LoopSpace Here’s an example of converting words to a number value like you suggested. Using the find routine on my 300,000+ dictionary took .8 seconds to find the 290 words. That was 2x longer then the time using string.gsub. I guess the number for each word could be calculated in advance to speed things up.
function setup()
val={a=2,b=3,c=5,d=7,e=11,f=13,g=17,h=19,i=23,j=29,k=31,l=37,m=41,n=43,
o=47,p=53,q=59,r=61,s=67,t=71,u=73,v=79,w=83,x=89,y=97,z=101}
-- calc value of letters
letters="ivnetxovfsne"
lettersVal=1
for z=1,#letters do
c=string.sub(letters,z,z)
lettersVal=lettersVal*val[c]
end
found={}
dictionary={"one","two","three","four","five","six","seven","eight","nine","ten"}
for a,b in pairs(dictionary) do
find(lettersVal,b)
end
print("letters = "..letters)
print("matched words")
print(table.concat(found,"\
"))
end
function find(letter,word)
local wv=1
for z=1,#word do
wv=wv*val[string.sub(word,z,z)] -- calc value of word
end
if letter%wv==0 then
table.insert(found,word)
end
end
Interesting, but I would expect a significant speed-up by precalculating. I’ll have to load in a dictionary to test these … I’m getting intrigued.
@LoopSpace I tried the precalculate on each word and got a run time of .18 seconds. Quite a speed up.
@dave1707 I’m trying a hash table lookup first. How are you calculating the times? I want to be sure I’m comparing like with like.
This code compares the primes method with a method based on hashing.
-- Dictionary
function setup()
local d = readText("Documents:Dictionary")
local dict = split(d,"[^\
\\r]+")
primes = {2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,
101}
local nd = 0
for k,v in ipairs(dict) do
-- for k=1,100 do
-- v = dict[k]
if v:find(" ") == nil then
nd = nd + 1
end
end
print("Dictionary size: " .. nd)
ndict = {}
sdict = {}
local w,t,e,s,n
s = os.clock()
for k,v in ipairs(dict) do
-- for k=1,100 do
-- v = dict[k]
if v:find(" ") == nil then
t = split(v,".")
table.sort(t)
w = table.concat(t)
if not sdict[w] then
sdict[w] = {}
end
table.insert(sdict[w],v)
end
end
e = os.clock()
print("Hashing: " .. (e-s))
s = os.clock()
for k,v in ipairs(dict) do
-- for k=1,100 do
-- v = dict[k]
if v:find(" ") == nil then
t = split(v,".")
n = 1
for l,u in ipairs(t) do
n = n * primes[u:upper():byte()-64]
end
if not ndict[n] then
ndict[n] = {}
end
table.insert(ndict[n],v)
end
end
e = os.clock()
print("Primes: " .. (e-s))
parameter.text("letters","gtexbju")
parameter.action("doFind",doFind)
--]]
end
--- splits 'str' into pieces matching 'pat', returns them as an array
function split(str,pat)
local tbl={}
str:gsub(pat,function(x) tbl[#tbl+1]=x end)
return tbl
end
function findHash(l)
local w = split(l,".")
table.sort(w)
local i = {}
for k=1,#w do
table.insert(i,0)
end
local r = {}
local j = #w
local s
while j > 0 do
i[j] = i[j] + 1
if i[j] == 2 then
i[j] = 0
j = j - 1
else
j = #w
s = {}
for l=1,#w do
if i[l] == 1 then
table.insert(s,w[l])
end
end
s = table.concat(s)
if sdict[s] then
for k,v in ipairs(sdict[s]) do
table.insert(r,v)
end
end
end
end
return r
end
function findPrime(l)
local t = split(l,".")
local n = 1
for l,u in ipairs(t) do
n = n * primes[u:upper():byte()-64]
end
local r = {}
for k,v in pairs(ndict) do
if n%k == 0 then
for l,u in ipairs(v) do
table.insert(r,u)
end
end
end
return r
end
function doFind()
local s,t,e
s = os.clock()
findHash(letters)
e = os.clock()
t = findHash(letters)
print("Hashing: ")
print(#t .. " words found in " .. (e-s) .. "s")
s = os.clock()
findPrime(letters)
e = os.clock()
t = findPrime(letters)
print("Primes: ")
print(#t .. " words found in " .. (e-s) .. "s")
end
I have a dictionary stored as a text asset. It reports just over 370,000 words.
In terms of initialising, I’m getting about 18s for the hash table and 25s for the primes table. Obviously, this could be stored in the file itself.
When it comes to finding all the words, with 7 letters I get of the order of 0.2s for the primes method, but an amazing 0.001s for the hash method.
I need to double-check my routines. With the letters ivnetxovfsne then the hash method found about 1400 words in less than 0.08s while the prime method only found about 500 words in 0.25s
@LoopSpace Your reusing letters in your hash routine. I put the word one
in the dict table. When I used the letters one
, hash found 1 word. When I entered onee
in letters, hash found 2 words. When I entered oneee
in letters, hash found 3 words. It kept reusing the letters on
for the three e’s. When you said you were going to use hash for the dictionary, I wasn’t sure how that was going to work. But I took your hash code and used just a few words in the dict table. I printed out sdict to see what was there and I saw what you were doing. You sorted the letters in each word and any words that had the same letters were considered the same word with the actual words in another table. I wonder how much the full dictionary table was reduced.
@LoopSpace I ran your hash routine on my dictionary file. The full dictionary table is 300,130 words. The hash table was 269,725 words. So hash reduced the table entry size by 30,405 words. That’s about 10% smaller. So I think there should be about a 10% reduction in speed.
Edit: I used the letters ivnetxovfsne on my 300,130 words and the hash routine found 1,189 words in .04 sec and the prime routine found 436 words in .08 sec.
Edit 2: With the letters ivnetxovfsne, you have 2 v’s, 2 n’s, 2 e’s. I removed the duplicate letters and got 299 matches with both the hash and prime routines.
Good catch!
Here’s my modified search:
function findHash(l)
local w = split(l,".")
table.sort(w)
local i = {}
for k=1,#w do
table.insert(i,0)
end
local r = {}
local j = #w
local s
local f = {}
while j > 0 do
i[j] = i[j] + 1
if i[j] == 2 then
i[j] = 0
j = j - 1
else
j = #w
s = {}
for l=1,#w do
if i[l] == 1 then
table.insert(s,w[l])
end
end
s = table.concat(s)
f[s] = true
end
end
for s,u in pairs(f) do
if sdict[s] then
for k,v in ipairs(sdict[s]) do
table.insert(r,v)
end
end
end
return r
end
With that, I get the same number of matches for the two routines. With the ive… set, I get 528 in .25s for primes and .073 for hash.
My dictionary also shows about a 10% reduction in size.
Loopspace How do you print the actual words found by your routine
@colincooper One way you can print the words in @LoopSpace code above is add the line print(table.concat(t,"\ "))
after the line of code t=findHash(letters)
or t=findPrime(letters)
in the function doFind()
.
Thank you