Help double-check code + tests? (function separating indexes and hash)

UberGoober · August 4, 2021, 4:50pm

I wrote this function to separate a table into its numerically-indexed and hash-table components:

        
        function separateNumericallyIndexedAndHashTablesIn(thisTable)
            local numericalTable, hashTable = {}, {}
            for k, v in pairs(thisTable) do
                print(k, v)
                if type(k) == "number" then
                    numericalTable[k] = v
                else
                    hashTable[k] = v
                end
            end
            return numericalTable, hashTable
        end

And I wrote this code in CodeaUnit to test it (I’m including the print statements I used to debug it):


        _:test("separateNumericallyIndexedAndHashTablesIn(...) returns correct tables", function()
            --create result flags
            local totalResult, numericalCountRight, numericalResult, hashCountRight, hashResult
            --make test table and verification tables to check results against
            local tableForKey = {}
            local testTable = {[1] = "one", [2] = "two", [4] = "four", [10] = "ten", 
                ["red"] = "foo1", ["five"] = "foo2", [tableForKey] = "foo3"}
            local correctNumericals = {[1] = "one", [2] = "two", [4] = "four", [10] = "ten"}
            local correctHash = {["red"] = "foo1", ["five"] = "foo2", [tableForKey] = "foo3"}
            --run the function
            local returnedNumericals, returnedHash = separateNumericallyIndexedAndHashTablesIn(testTable)
            --inspect counts
            local numericalCounter = 0
            for i, v in pairs(returnedNumericals) do
                numericalCounter = numericalCounter + 1
            end
            numericalCountRight = numericalCounter == 4
            print(table.unpack(correctNumericals))
            print(table.unpack(returnedNumericals))
            local hashCounter = 0
            for i, v in pairs(returnedHash) do
                hashCounter = hashCounter + 1
            end
            hashCountRight = hashCounter == 3
            --inspect contents
            if numericalCountRight and hashCountRight then
                numericalResult = true 
                hashResult = true 
                for i, v in pairs(correctNumericals) do
                    if v ~= returnedNumericals[i] then numericalResult = false end
                end
                for i, v in pairs(correctHash) do
                    if v ~= returnedHash[i] then hashResult = false end
                end
            end
            --overall result is AND combination of all results
            print(numericalCountRight, numericalResult, hashCountRight, hashResult)
            totalResult = numericalCountRight and numericalResult and hashCountRight and hashResult
            _:expect(totalResult).is(true)
        end)

…does this look right? Are there any cases you think this would miss?
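One way to hunt for missed cases is to feed the function some awkward keys. Here is a minimal sketch in plain Lua; the function is repeated without its debug print so the snippet stands alone, and the keys 0, -2, 3.3 and true are only illustrative picks, not part of the original test:

```lua
-- Copy of the function from the post above, minus the debug print.
local function separateNumericallyIndexedAndHashTablesIn(thisTable)
    local numericalTable, hashTable = {}, {}
    for k, v in pairs(thisTable) do
        if type(k) == "number" then
            numericalTable[k] = v
        else
            hashTable[k] = v
        end
    end
    return numericalTable, hashTable
end

-- Illustrative edge-case keys: zero, a negative, a fraction, and a boolean.
local edgeCases = {[0] = "zero", [-2] = "minus two", [3.3] = "fraction", [true] = "bool"}
local numericals, hash = separateNumericallyIndexedAndHashTablesIn(edgeCases)

-- Every key of type "number" lands in the first table, array-like or not.
print(numericals[0], numericals[-2], numericals[3.3])
-- Non-number keys, booleans included, land in the second.
print(hash[true])
```

Whether 0, negatives and fractions belong in the "numerical" result depends on what the caller wants from it, so this may be acceptable behaviour rather than a bug.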

Bri_G · August 4, 2021, 6:52pm

@UberGoober - I’m sure this will come in handy sometime, thanks for the post. Always looking for any new tools that I can add for debugging (largely because I make a mess of coding!!!)

Just out of interest, started a new project and I thought - ‘can I place a few generic tools in a dependency for that project?’ - which I can hopefully use on other projects. Tried it out but ended up placing specific tools like parameter.watch(). No big deal - I just have a dependency project for each development project. But it does make it easy to switch them on and off by turning your dependency on and off.

RonJeffries · August 4, 2021, 11:50pm

i agree that the code separates numeric from non-numeric keys. i’m not sure if you think it produces an array table and a hashed table, and i very much doubt that it does.

RonJeffries · August 4, 2021, 11:52pm

i’m curious … what purpose have you for this?

UberGoober · August 5, 2021, 12:01am

@RonJeffries I think it will produce an array if the numerical keys are sequential. I think tables with non-sequential numeric keys are called sparse tables in lua. I think this code will handle both.

UberGoober · August 5, 2021, 12:04am

@RonJeffries In point of fact, I wrote this code to test some other code!

I was making a table that had both an array part and a hash part, and I realized the easiest way to test it would be to separate the parts and examine them each on their own.

And then I realized the code to do that could be generalizable, and plus might need testing on its own anyway.

UberGoober · August 5, 2021, 4:04pm

@RonJeffries I think I’ve corrected it so it returns an array table and a hash table.

It now checks for a key being an integer and not just a number.

If I’m wrong, could you suggest a test I could add to make it fail?

New function:


        function separateArrayAndHashTablesIn(thisTable)
            local arrayTable, hashTable = {}, {}
            for k, v in pairs(thisTable) do
                if type(k) == "number" and k == math.ceil(k) then
                    arrayTable[k] = v
                else
                    hashTable[k] = v
                end
            end
            return arrayTable, hashTable
        end

New test:

        
        _:test("separateArrayAndHashTablesIn(...) returns correct tables", function()
            --create result flags
            local totalResult, arrayCountRight, arrayResult, hashCountRight, hashResult
            --make test table 
            local tableForKey = {}
            local testTable = {[1] = "one", [2] = "two", [4] = "four", [10] = "ten", 
                ["red"] = "foo1", ["five"] = "foo2", [tableForKey] = "foo3", [3.3] = "three point three"}
            --make verification tables to check results against
            local correctArray = {}
            table.insert(correctArray, "one"); table.insert(correctArray, "two"); correctArray[4] = "four"; correctArray[10] = "ten"
            local correctHash = {[3.3] = "three point three", ["red"] = "foo1", ["five"] = "foo2", [tableForKey] = "foo3"}
            --run the function
            local returnedArray, returnedHash = separateArrayAndHashTablesIn(testTable)
            --inspect counts
            local arrayCounter = 0
            for i, v in pairs(returnedArray) do
                arrayCounter = arrayCounter + 1
            end
            arrayCountRight = arrayCounter == 4
            local hashCounter = 0
            for i, v in pairs(returnedHash) do
                hashCounter = hashCounter + 1
            end
            hashCountRight = hashCounter == 4
            --inspect contents
            if arrayCountRight and hashCountRight then
                arrayResult = true 
                hashResult = true 
                for k, v in pairs(correctArray) do
                    if v ~= returnedArray[k] then arrayResult = false end
                end
                for k, v in pairs(correctHash) do
                    if v ~= returnedHash[k] then hashResult = false end
                end
            end
            --debugging stuff: change to "if false" to turn off
            if true then
                --local, so the helper doesn't leak into the global namespace
                local function stringFrom(thisTable)
                    local returnString = ""
                    for k, v in pairs(thisTable) do
                        returnString = returnString.."("..tostring(k).." : "
                        returnString = returnString..tostring(v)..") "
                    end
                    return returnString
                end
                print("correctArray: "..stringFrom(correctArray))
                print("returnedArray: "..stringFrom(returnedArray))
                print("correctHash: "..stringFrom(correctHash))
                print("returnedHash: "..stringFrom(returnedHash))
                print("arrayCountRight: ", arrayCountRight)
                print("arrayResult: ", arrayResult)
                print("hashCountRight: ", hashCountRight)
                print("hashResult: ", hashResult)
            end
            --overall result is AND combination of all results
            totalResult = arrayCountRight and arrayResult and hashCountRight and hashResult
            _:expect(totalResult).is(true)
        end)
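The count-then-compare pattern in the test could also be folded into one reusable comparison. A rough sketch follows, assuming a shallow `==` comparison is enough for these tables; `shallowEqual` is a made-up helper name, not something from CodeaUnit:

```lua
-- Hypothetical helper: true when both tables hold exactly the same
-- key/value pairs (shallow comparison using ==).
local function shallowEqual(a, b)
    -- Every pair in a must be present in b...
    for k, v in pairs(a) do
        if b[k] ~= v then return false end
    end
    -- ...and b must not contain keys that a lacks.
    for k in pairs(b) do
        if a[k] == nil then return false end
    end
    return true
end

local tableForKey = {}
local expected = {[3.3] = "three point three", red = "foo1", [tableForKey] = "foo3"}
local same     = {[3.3] = "three point three", red = "foo1", [tableForKey] = "foo3"}
local extra    = {[3.3] = "three point three", red = "foo1", [tableForKey] = "foo3", five = "foo2"}

print(shallowEqual(expected, same))   -- same pairs
print(shallowEqual(expected, extra))  -- second table has an extra key
```

With something like this, the test body might shrink to two lines of the form `_:expect(shallowEqual(returnedArray, correctArray)).is(true)`, one for each returned table.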

RonJeffries · August 5, 2021, 10:32pm

i think it splits them, but i don’t think you’re guaranteed an output you could use ipairs on. what if it just had 1 and 39?

UberGoober · August 6, 2021, 7:54am

@RonJeffries I think the output would be identical to a table made like this:


      thisTable = {}
      thisTable[1] = "Foo"
      thisTable[39] = "Bar"

…and no, I don’t think an ipairs iteration would reach all the numerical indexes in it, but I think that’s nothing to do with my code, it’s just how lua behaves.

I think lua only guarantees array-type behavior up to the first sequential index that returns nil.

So I think lua would treat the tables { [1] = "foo", [2] = "bar", [4] = "etsky" } and { [1] = "foo", [2] = "bar", [4] = "etsky", [39] = "zee" } as identical for the purposes of ipairs, because in both of them, checking the value at index 3 would return nil, and it would stop there. The # length operator is less predictable: with gaps it's only guaranteed to return some "border" (an index n where t[n] isn't nil but t[n + 1] is), so it could come back as 2 or 4 for these tables (or even 39 for the second one).

To be explicit, if you used ipairs to concatenate their strings, both would output only “foobar”.
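A quick way to check this is a sketch like the one below, in plain Lua. `joinWithIpairs` and `isBorder` are helper names invented for the illustration. The `#` results are printed rather than assumed, because with gaps the manual only promises a border:

```lua
local short = {[1] = "foo", [2] = "bar", [4] = "etsky"}
local long  = {[1] = "foo", [2] = "bar", [4] = "etsky", [39] = "zee"}

-- Concatenate whatever ipairs visits; it stops at the first nil (index 3).
local function joinWithIpairs(t)
    local parts = {}
    for _, v in ipairs(t) do
        parts[#parts + 1] = v
    end
    return table.concat(parts)
end

print(joinWithIpairs(short))  -- foobar
print(joinWithIpairs(long))   -- foobar

-- # only promises a "border": t[n] ~= nil and t[n + 1] == nil (or n == 0).
local function isBorder(t, n)
    return (n == 0 or t[n] ~= nil) and t[n + 1] == nil
end
print(#short, isBorder(short, #short))
print(#long, isBorder(long, #long))
```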

I think table.unpack() also gives unexpected results when there are index gaps, and possibly a few other language features similarly get tripped up by gaps.
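For table.unpack specifically, here is a small sketch of what seems to be going on, assuming Lua 5.2 or later (where the function lives in `table`). Without explicit bounds it unpacks from 1 to `#t`, so a gap makes the result depend on whichever border `#` happens to report. Passing the bounds yourself sidesteps that:

```lua
local gappy = {[1] = "foo", [2] = "bar", [4] = "etsky"}

-- With no bounds, table.unpack uses #gappy as the upper limit, and with a
-- gap that length is not pinned down, so the number of results can vary.
print(table.unpack(gappy))

-- Explicit bounds give a predictable result: the gap comes back as nil.
print(table.unpack(gappy, 1, 4))                -- foo  bar  nil  etsky
print(select("#", table.unpack(gappy, 1, 4)))   -- 4
```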

But none of those problems are either solved or made worse by this code here, so I don’t think my code is actually doing anything wrong, unless I’m misunderstanding your point—am I?

RonJeffries · August 6, 2021, 1:00pm

i think your code does put all numeric keys that evaluate to integers into one table, and the others into the other table. i’m not sure what it does with 10.0 or other integer floats: i suspect it puts them with the integers.

aside from “this is interesting”, i don’t get it. if we throw a table at it that we have just lying around … what use are the output tables? i don’t see what it’s good for.

but i think it probably does what you think it does.
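The 10.0 question can be checked directly. A small sketch, assuming Lua 5.3 or later (where `math.type` exists and a float key with an integral value is converted to the matching integer when it is stored); the function is copied from earlier in the thread so the snippet runs on its own:

```lua
-- Copy of separateArrayAndHashTablesIn from earlier in the thread.
local function separateArrayAndHashTablesIn(thisTable)
    local arrayTable, hashTable = {}, {}
    for k, v in pairs(thisTable) do
        if type(k) == "number" and k == math.ceil(k) then
            arrayTable[k] = v
        else
            hashTable[k] = v
        end
    end
    return arrayTable, hashTable
end

local input = {[10.0] = "ten", [3.3] = "three point three", red = "foo1"}
local arrayTable, hashTable = separateArrayAndHashTablesIn(input)

print(arrayTable[10])   -- ten
print(hashTable[3.3])   -- three point three

-- Show the subtype of each key that ended up in the array table.
-- (math.type is a Lua 5.3+ function; this is the assumption noted above.)
for k in pairs(arrayTable) do
    print(k, math.type(k))
end
```

So the integer-valued float does seem to end up with the integers, and by the time `pairs` hands the key over it is already an integer rather than a float.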

RonJeffries · August 6, 2021, 3:28pm

interesting fact: t[1.0] == t[1]. but run this:

    function setup()
        t={}
        t[1] = "one"
        t[1.0] = "onedotoh"
        t[15432.0/15432.0] = "big/big"
        t[.0987/0.0987] = "div/div"
        t[(1.0/100.0)*100]="div/times"
        x = 0.0
        for i=0, 99, 1 do
            x = x + 0.01
        end
        t[x]="hund"
        for k,v in pairs(t) do
            print(k,v)
        end
    end

UberGoober · August 6, 2021, 3:32pm

@RonJeffries well as I noted, I think, it’s for testing.

I have a table that I’m using both the array parts and the hash parts of, and they reference each other.

To inspect the table, to be sure it’s doing what I think it’s doing, I think it’s going to be much simpler to be able to split it up when I need to, because then I can run pairs() on each table individually, and I can be sure I’m inspecting only the part I want to test.

RonJeffries · August 6, 2021, 4:26pm

this is not advice, but based on my small understanding of what you’re doing, i might:

- Create a small class with two collections;
- Just use two separate collections.

If i find that i have something together and then sometimes want it apart, i treat it as a clue that either it needs to be two things, or one thing with smart behavior, or one of the two things is something i shouldn’t do, at least not that way.
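The "small class with two collections" option might look roughly like the sketch below in plain Lua. Everything here is hypothetical: the name `Registry`, its fields, and the guess that one collection is an ordered list of groups while the other maps each item back to its group. It uses `setmetatable` rather than Codea's `class()` so it runs anywhere:

```lua
-- Hypothetical class: names and fields are illustrative only.
local Registry = {}
Registry.__index = Registry

function Registry.new()
    -- groups: ordered list of groups; groupOf: item -> the group holding it.
    return setmetatable({groups = {}, groupOf = {}}, Registry)
end

-- Append a new group (list side) and record, for each member,
-- which group it belongs to (lookup side).
function Registry:addGroup(members)
    local group = {}
    for _, member in ipairs(members) do
        group[#group + 1] = member
        self.groupOf[member] = group
    end
    self.groups[#self.groups + 1] = group
    return group
end

local registry = Registry.new()
local colours = registry:addGroup({"red", "five"})
print(#registry.groups)                      -- 1
print(registry.groupOf["red"] == colours)    -- true
```

Because the two collections are separate fields, a test can inspect `registry.groups` and `registry.groupOf` directly, with no splitting step needed.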

UberGoober · August 6, 2021, 5:55pm

@RonJeffries You might be right.

I was partly doing it this way because every table in the array holds one or more elements from the hash, and every hash element points to the table it’s in.

And I was partly doing it this way just because lua can do it, and it’s a unique language feature that seems like it was intended to be used for this sort of thing.

RonJeffries · August 6, 2021, 10:24pm

yes, it’s fun just to see what we can make it do

UberGoober · August 7, 2021, 7:06pm

@RonJeffries

I expanded your test there a bit to see what was going on. Super weird.


    function weirdness()
        function printTable(t, description)
            print(description..":")
            for k,v in pairs(t) do
                print(k,v)
            end
        end
        t={}
        t[1] = "one"
        t[1.0] = "onedotoh"
        printTable(t, "starting")
        t[15432.0/15432.0] = "big/big"
        printTable(t, "big/big")
        t[.0987/0.0987] = "div/div"
        printTable(t, "div/div")
        t[(1.0/100.0)*100]="div/times"
        printTable(t, "div/times")
        x = 0.0
        for i=0, 99, 1 do
            x = x + 0.01
        end
        t[x]="hund"
        printTable(t, "built by iteration")
    end

RonJeffries · August 7, 2021, 9:21pm

what’s going on is that the sum of 100 0.01s is not 1.0. it just rounds to 1.0 for print. if you print x-1, you get 6.6613381477509e-16

this is why even 64bit floats aren’t good for money calcs.
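To see the hidden digits, something like this sketch should work in plain Lua. It asks `string.format` for 17 significant digits instead of relying on the default `print` formatting, and it also runs the integer check from `separateArrayAndHashTablesIn` on the result:

```lua
local x = 0.0
for _ = 1, 100 do
    x = x + 0.01
end

print(x)                           -- default formatting hides the error
print(string.format("%.17g", x))   -- 17 significant digits expose it
print(x == 1.0)                    -- false
print(x - 1)                       -- the leftover Ron mentions

-- The integer check from separateArrayAndHashTablesIn, applied to x:
-- math.ceil rounds x up to 2, so x would be filed under the hash table.
print(x == math.ceil(x))           -- false
```

That would explain why "hund" shows up as its own entry in the earlier output even though its key prints as 1.0.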

UberGoober · August 7, 2021, 11:41pm

@RonJeffries Argh. There is a huge thread somewhere on here where I kind of ranted about this kind of thing. Not the practice itself—representing numbers one way when they’re stored a different way—but the fact that it’s not transparent at all.

RonJeffries · August 8, 2021, 8:43pm

it’s not a problem with the language, it’s an inherent part of binary (floating point) arithmetic. 1/10 decimal is an infinite repeating fraction in binary. just as no finite decimal word size can represent 1/3, 0.33333333…, binary can’t do 1/10. it’s just a thing, one of many, that programmers need to know.
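A short sketch of both halves of that point, assuming Lua 5.3 or later (for the integer subtype and the `//` operator). The first part prints the value actually stored for 0.1. The second shows the usual workaround for money, counting whole cents as integers, which is a common convention rather than anything specific to this thread:

```lua
-- 0.1 cannot be stored exactly; extra digits show the nearest binary value.
print(string.format("%.20f", 0.1))

-- The classic symptom: two rounded values add up to a different rounded value.
print(0.1 + 0.2 == 0.3)   -- false

-- Common workaround for money: count whole cents with integers.
local totalCents = 0
for _ = 1, 100 do
    totalCents = totalCents + 1    -- one cent at a time, no rounding
end
print(totalCents == 100)           -- true
print(string.format("%d.%02d", totalCents // 100, totalCents % 100))  -- 1.00
```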