Arrays, Pointers, and Copying

Greetings,

So my current project is progressing well and doing 100% of what I want. There are no bugs, so technically I am not here for help.

However, as I was writing some code, something came into my head that I would like clarified …

I am taking some data out of a source table, and reformatting it to reduce some overheads in looking up tile set IDs during actual execution - so this is all done during setup (level loading etc).

However, Lua uses pointers a lot, and I was wondering: if I destroyed the source table, what would happen?

So the example

AClass = class()

function AClass:init(layer, parent)                     -- source table is passed in layer
    
    -- a back reference to the parent is useful
    self.parent = parent
    
    self.type = layer.type                     -- am I making a copy of this field
    self.name = layer.name                 -- and this one? Or am I pointing into original layer table?

    -- ....

    self.data = {}        
    for i=1,#layer.data do
        local keyID = self.parent.tilesets:translateGID(layer.data[i])
        table.insert( self.data, { key=keyID, data=layer.data[i] } )  -- does data still point to original table?
    end

end

Hopefully my question is clear enough. I am pretty sure I am copying the information, however some parts of the source layer are not simple strings or numbers but rather tables themselves.

So layer.data[i] may be a table - am I making a unique copy of that table? Or am I pointing at that table?

Thanks In Advance,
D

Just to clarify, Lua does not use pointers (not in the traditional C/C++ sense); what gets passed around are references.

When you talk about destroying the source table, you don’t normally “destroy” tables (again using destroy in the C++ sense) as you don’t actually manage memory at all. This is all handled by the Lua garbage collector: when the Lua runtime recognises that a resource no longer has any references to it, it treats it as garbage, and at some point in the future, when the GC runs, it will be removed.

In your example above, if layer.type or layer.name is a primitive value then it will be copied; if it references a more complex item then the reference is copied (a bit like copying the value of a C pointer), so that afterwards both variables actually refer to the same resource. Note that Lua’s collector does not keep a reference count; it traces what is still reachable, so the resource stays alive for as long as at least one reference to it remains.
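To make that concrete, here is a minimal sketch in plain Lua. The layer table and its contents are made up for illustration and are not the real source data; the point is only to show which assignments share storage.

```lua
-- Hypothetical stand-in for the source layer table.
local layer = {
    type = "tilelayer",                 -- a string: behaves as a value
    name = "ground",
    data = { { gid = 7 }, { gid = 9 } } -- entries are tables
}

local copy = {}
copy.type  = layer.type      -- the value is copied
copy.first = layer.data[1]   -- the reference is copied, the table is shared

copy.type = "objectgroup"    -- rebinding the copy: layer.type is unaffected
copy.first.gid = 42          -- mutating through the shared reference

print(layer.type)                   --> tilelayer
print(layer.data[1].gid)            --> 42
print(copy.first == layer.data[1])  --> true (same table)
```

So the string fields end up independent, while anything that is itself a table is still the original table seen through a second name.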

This is of particular relevance (and a source of confusion) when it comes to duplicating tables / table entries, and whether you’re making a shallow copy (i.e. copying references) or a deep copy (actually duplicating the initial data).
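A rough sketch of what a shallow copy does and where it bites. The helper name shallowCopy and the sample table are assumptions for illustration only.

```lua
-- Shallow copy: a new outer table, but nested tables are still shared.
local function shallowCopy(t)
    local out = {}
    for k, v in pairs(t) do
        out[k] = v           -- tables are copied by reference here
    end
    return out
end

local original = { id = 1, tiles = { 10, 20, 30 } }
local shallow  = shallowCopy(original)

shallow.id = 2               -- only changes the copy
shallow.tiles[1] = 99        -- changes the one table both of them share

print(original.id)           --> 1
print(original.tiles[1])     --> 99
```

A deep copy would additionally duplicate tiles, so the last line would still print 10.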

Hope this helps.

@TechDojo yes that does help, a lot.

So further along those lines, if I have complex data types within my layer source, how would I go about making a deep copy? Can you point me to some samples?

I really do not think it is a big deal since it is just references and thus not really taking up space, however I like a “clean house”.

D

@fly.ing.fox

As I understand it, Lua only treats simple values (numbers, strings, booleans and nil) as if they were stored directly in variables, so vectors, tables, class objects etc are passed by pointer.

So to duplicate a table, you need to make a deep copy.

tbl={} --the table has to exist before it can be indexed
tbl[1]=3
tbl[2]=vec2(1,3)
tbl[3]={1,2,3}

A=tbl[1] --3 is copied
B=tbl[2] --B is a pointer to the vec in tbl[2] rather than to tbl[2] itself
C=tbl[3] --C is a pointer to the table in tbl[3]
D=tbl --D is a pointer to the original tbl

The easiest way to check how it works is to make a “copy” of something, change the value of the copy, then check the value of the original.

If layer.type or layer.name are themselves tables within the layer table (eg layer = { type = {1,2,3}, name = {}}), then you’re just creating references. If, however, they are plain values (eg layer = {type = 1, name = "hi"}), then you are creating independent copies of them. Same with layer.data[i]. Also, userdata (eg vec2s) behave like tables in this respect, unless you perform an operation on them:

a = vec2(50,50)
b = a -- just a reference to a, not an independent copy
c = a * 2 --operation creates a new, independent vec2 object

The way I think about it is that a = b always makes a duplicate of what is in b, but if b “contains” a table, vec2’s, or anything more complex than a number or a string, then b actually only stores a pointer, and so a gets a copy of the pointer.

Everything in Lua is a reference.

When you say a = "hello" then a is a reference to the memory location of the string "hello". If you also say b = "hello" then a and b are pointing to exactly the same item in memory.

Most of the time, this can be safely ignored thanks to two Lua facts:

  1. In an assignment, the right-hand side is traversed until a memory reference remains. Thus in c = b, c also becomes a reference to the string "hello" and no connection remains between b and c. This includes “implicit” assignments such as happens when data is passed into a function.

  2. Lua’s basic data types are immutable. So, with the above, if you try to modify the string that b holds, you’ll find that you cannot actually modify it. All you can do is create a new string and set b to reference it. This is why there is no increment operator in Lua. You can’t increment a number, all you can do is create a new one out of the old one and reassign the reference: d = d + 1.
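A small sketch of those two facts in plain Lua. The function name shout is just an illustrative assumption.

```lua
local a = "hello"
local b = a              -- b now refers to the same string value
b = b .. " world"        -- builds a new string and rebinds b; a is untouched
print(a)                 --> hello
print(b)                 --> hello world

-- Passing into a function is an implicit assignment to the parameter.
local function shout(s)
    s = s:upper()        -- rebinds the parameter only
    return s
end

local loud = shout(a)
print(a)                 --> hello
print(loud)              --> HELLO

local d = 1
local e = d
e = e + 1                -- a new number is assigned to e; d keeps its value
print(d)                 --> 1
print(e)                 --> 2
```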

But tables are mutable, and so when you have two references to the same table then changing one affects the other. Only it doesn’t since there’s only one actual table and both point to it.

This is the same wherever the references are in the code. You can have a “global” table which you pass into a function which saves a “local” reference to it. These are the same table, so altering the “local” one has global consequences.
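For example, a sketch with a made-up global table called settings. Mutating through the local name changes the one shared table, whereas assigning a fresh table to the local name only rebinds that name.

```lua
settings = { volume = 5 }    -- a hypothetical "global" table

local function mute(t)
    local cfg = t            -- a "local" name for the same table
    cfg.volume = 0           -- mutates the one shared table
end

local function replace(t)
    t = { volume = 11 }      -- rebinds the local name only; caller unaffected
end

mute(settings)
print(settings.volume)       --> 0

replace(settings)
print(settings.volume)       --> 0 (still the original table)
```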

(Note that Codea’s additional data types behave like tables in that they aren’t immutable: p = vec2(1,0) q = p q.x = 3 will change the object that p references and thus change what p looks like.)

Since everything is a reference, deletion works slightly differently to how you might expect. As others have remarked, you can’t delete something. All you can do is remove a reference to it. Once all the references have gone, it is garbage collected. But a = "hello" b = a a = nil doesn’t destroy the string that b refers to because b is still referencing it. It’s just no longer accessible via a.
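One way to watch this happen is with a weak-valued table, which does not count as a reference for the collector. This is only a sketch: the names are made up, it uses a table rather than a string (strings are not removed from weak tables), and calling collectgarbage() by hand is for demonstration, not something you normally need to do.

```lua
-- Entries in a weak-valued table do not keep their values alive,
-- so the entry vanishes once the value has been collected.
local watcher = setmetatable({}, { __mode = "v" })

local a = { name = "layer" }
local b = a
watcher[1] = a

a = nil                      -- one reference gone, b still holds the table
collectgarbage()
print(watcher[1] ~= nil)     --> true

b = nil                      -- last strong reference gone
collectgarbage()
print(watcher[1])            --> nil
```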

Once you get your head round this, a lot of stuff suddenly makes sense. Such as why you need to do a deep copy of a table if you want to be able to modify it in one circumstance and not have that affect the original. And you learn to be very wary of the word “copy”!
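Since a sample was asked for earlier in the thread, here is one possible deep copy, written as a sketch rather than a definitive implementation. The name deepCopy and the sample layer are assumptions. It copies nested tables recursively, copes with tables that refer back to themselves, and reuses (does not copy) the metatable. Anything that is not a table, including functions and userdata such as Codea’s vec2, is returned as-is and so stays shared.

```lua
-- `seen` maps each original table to its copy, so shared or cyclic
-- references are copied once and keep their shape in the result.
local function deepCopy(value, seen)
    if type(value) ~= "table" then
        return value                 -- non-tables are returned unchanged
    end
    seen = seen or {}
    if seen[value] then
        return seen[value]           -- already copied: reuse that copy
    end
    local copy = {}
    seen[value] = copy
    for k, v in pairs(value) do
        copy[deepCopy(k, seen)] = deepCopy(v, seen)
    end
    return setmetatable(copy, getmetatable(value))
end

-- Hypothetical data, including a self-reference.
local layer = { name = "ground", data = { { gid = 7 }, { gid = 9 } } }
layer.owner = layer

local clone = deepCopy(layer)
clone.data[1].gid = 42

print(layer.data[1].gid)     --> 7
print(clone.data[1].gid)     --> 42
print(clone.owner == clone)  --> true
```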

Incidentally, Codea’s additional data types, such as vec2, behave very like tables. So if you get a deep copy method off the internet, it will need to be modified to clone Codea’s extra stuff, otherwise it won’t be an independent copy.