Here comes a hard part of Lua… string matching. It confused me for a long time, but after multiple Google searches I finally worked out how to do it.
The main thing I’ll be covering here is string.match. Depending on the pattern you supply, it can return multiple values.
--                        String to extract from         Pattern to match
local a, b = string.match("blah some text blah 47 blah", "blah (.-) blah (%d-) blah") -- Returns "some text", "47" (both are strings)
Complicated? Yes.
The first string, the one to extract from, can be anything you want. The pattern is the complicated part. First is “blah”, as you can see in the first string, then (.-). What’s this? It’s the first capture. You put parentheses around something you want to be returned as a value. Inside the parentheses, the . matches any single character. The - suffix means zero or more of that character, taking the shortest match. The * suffix also means zero or more, but takes the longest match.
Say you had blah text blah text blah. If you searched it with blah (.-) blah, it would return text. If the pattern was blah (.*) blah, it would return text blah text. Since there are two spots where the string contains some blah, some text, and then blah again, the choice of - or * matters. Since - is shortest, it stops the first time it finds an end to the pattern. The * grabs as much as it can and only gives characters back until the rest of the pattern matches.
Since it’s looking for blah (.-) blah, it finds blah text blah, which matches. text qualifies as .- and, since that part is in parentheses, it is returned as a value from string.match().
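Here is the shortest versus longest difference as something you can run. It is just a small sketch using the same example string as above; the variable names lazy and greedy are my own.

```lua
local s = "blah text blah text blah"

-- "-" stops at the first place where the rest of the pattern can match.
local lazy = string.match(s, "blah (.-) blah")

-- "*" takes as much as it can while still letting the rest of the pattern match.
local greedy = string.match(s, "blah (.*) blah")

print(lazy)   -- text
print(greedy) -- text blah text
```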
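Then we have the number capture, which should be written (%d-). %d is a decimal digit, so it looks for digits (a plain d without the % would only match the letter d). Again, it’s in parentheses and is returned as a value, though as the string "47" rather than a number. And again, it uses the - suffix to mean the shortest match.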
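One thing that is easy to miss: a digit capture comes back as a string, and a pattern that does not match gives you nil. A quick sketch of both, assuming you want an actual number at the end (the variable names are made up for the example, and I used %d+ here, which means one or more digits):

```lua
local text, num = string.match("blah some text blah 47 blah", "blah (.-) blah (%d+) blah")

print(text)       -- some text
print(type(num))  -- string

-- Convert the captured digits before doing arithmetic with them.
local n = tonumber(num)
print(n + 1)      -- 48

-- When the pattern does not match at all, string.match returns nil.
local missing = string.match("no digits here", "(%d+)")
print(missing)    -- nil
```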
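If you want to match a character that has a special meaning in patterns, such as ( or ), put a % before it so the function knows you mean the literal character, not pattern syntax. The special characters are ( ) . % + - * ? [ ] ^ $, and it is safe to put % in front of any other non-alphanumeric character too. And if you want to say % as part of the string? Use %% instead.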
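A small sketch of escaping in practice. The sample strings are invented just for this example.

```lua
local price = "Total (USD): 100%"

-- %( and %) are literal parentheses; the unescaped pair in the middle is the capture.
local currency = string.match(price, "%((%a+)%)")

-- %% is a literal percent sign.
local percent = string.match(price, "(%d+)%%")

print(currency, percent)  -- USD   100

-- An unescaped . matches any character, an escaped %. only a real dot.
local loose = string.match("axb", "a.b")
local strict = string.match("axb", "a%.b")
print(loose, strict)      -- axb   nil
```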
If you use % on a letter, it stops being a literal letter and becomes a character class: %d matches any digit, while a plain d only matches the letter d. A class on its own doesn’t capture anything. Like, in the pattern (the second argument of string.match), if you used %d- outside of parentheses it would mean “any digits can be in the place of %d-, I’m not sure which, any are fine, just only digits, and don’t return them as a separate value.”
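To see what the parentheses change, here is the same class used with and without them. This is only a sketch; the string and the variable names are made up. Note that when a pattern has no captures at all, string.match returns the whole matched text.

```lua
local s = "item 42 costs 7 coins"

-- No parentheses anywhere: the whole match comes back.
local whole = string.match(s, "item %d+")

-- Parentheses around %d+: only the digits come back.
local digits = string.match(s, "item (%d+)")

-- %d+ outside the parentheses has to match, but only the captured word is returned.
local word = string.match(s, "item %d+ (%a+)")

-- Without the %, d is just the letter d, so nothing matches here.
local none = string.match(s, "item d+")

print(whole, digits, word, none)  -- item 42   42   costs   nil
```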
Once you have all that down, the rest is pretty much the same, except for the different character classes you can use instead of . or %d. As far as I know:
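. is any character
%d is a decimal digit
%s is a whitespace character (" ", and also tab, newline and the like)
%x is a hexadecimal digit (as in a hex color)
%u is an uppercase letter
%a is any letter
%c is a control character (such as "\n" or "\t")
%l is a lowercase letter
%w is an alphanumeric character (a letter or a digit)
%z is the character with representation 0, i.e. "\0" (Lua 5.1; deprecated in 5.2)
%f[set] is a "frontier" pattern, which matches the empty spot where the characters change from not being in the set to being in it, more info here: http://lua-users.org/wiki/FrontierPattern
%bxy matches a balanced stretch that starts with x and ends with y, so %b() matches a pair of parentheses and everything between them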
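A few of these in action, as a rough sketch. All the sample strings and variable names are invented for the example, and the frontier line follows the whole-word trick described on the wiki page linked above.

```lua
-- %x: hexadecimal digits, e.g. pulling the digits out of a hex color
local hex = string.match("color: #FF8800;", "#(%x+)")

-- %u and %l: one uppercase letter followed by lowercase letters
local capitalized = string.match("hello World", "%u%l+")

-- %w: letters and digits; the underscore is neither, so the match stops there
local alnum = string.match("user_42", "%w+")

-- %s: whitespace, which covers space, tab and newline
local gap = string.match("a \t\nb", "a(%s+)b")

-- %bxy: balanced pair, here the outermost parentheses and everything inside
local balanced = string.match("call(f(x), y) end", "%b()")

-- %f[set]: frontier, used here to find "the" only as a whole word
local from, to = string.find("other the", "%f[%a]the%f[%A]")

print(hex, capitalized, alnum, #gap, balanced, from, to)
```

The "the" inside "other" is skipped because the character before it is a letter, so the frontier test fails there.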
More info I’ve found:
http://stackoverflow.com/questions/2693334/lua-pattern-matching-vs-regular-expressions
http://www.lua.org/pil/20.2.html
Also, there’s string.gmatch, which is close to string.match, but is an iterator:
for var1, var2 in string.gmatch(stringtosearch, patterntomatch) do -- one variable per capture in the pattern
    ...
end
It goes through the string to search like string.match does, but it doesn’t stop at the first match. It keeps going until it reaches the end of the string, and each time the pattern matches, it runs the code in the for loop with the values you asked it to return.
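Here is a small runnable sketch of string.gmatch. The settings string and the table names are made up; the idea is that each pass through the loop receives the captures from one match.

```lua
local settings = "width=80, height=24, depth=8"
local result = {}

-- Two captures in the pattern, so two loop variables per match.
for key, value in string.gmatch(settings, "(%a+)=(%d+)") do
  result[key] = tonumber(value)
end

print(result.width, result.height, result.depth)  -- 80   24   8

-- With no captures in the pattern, each pass gets the whole match instead.
local words = {}
for word in string.gmatch("one two three", "%a+") do
  words[#words + 1] = word
end

print(#words, words[3])  -- 3   three
```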
Sorry about the massive post, this is a massive topic…