cmodule 0.1.1: sneak peek at a dynamic module loading system designed for Codea

Andrew_Stacey · May 17, 2013, 11:00am

@toadkick I haven’t downloaded 0.0.8 yet - will do so soon - but just wanted to record that I’ve now successfully converted my Library to cmodule. This was my “Can I use it?” test so the answer is a resounding “Yes”.

Interestingly, it’s shown up some dependencies in my code that I didn’t know were there.

toadkick · May 17, 2013, 11:34am

@Andrew_Stacey: great! Thank you for taking the time to give it a go.

Your caching idea actually gave me an idea for a possible better way to implement the nocache/“only execute once” feature that I’m going to test out on my flight here soon. If it works I’ll check it in when the plane lands.

toadkick · May 17, 2013, 5:32pm

@Andrew_Stacey: One last update today, and that will probably be it for a while unless there are some other bugs or issues to fix.

v0.0.9 gets rid of cmodule.null; it’s no longer necessary since cmodule now keeps all of its modules alive unless explicitly unloaded.

Also, I changed the way you specify that your module should not be cached; returning a value from the module to indicate that was a poor implementation IMO, because it’s possible you might actually want to return something from your module, even if you aren’t caching it. So, now, to disable caching for your module, put this line of code somewhere in your module file:

_M[cmodule.cache] = false

cmodule.cache is a unique key, so there is no chance of it clashing with your own variables, and when cimport() has finished loading the module it removes it, so your _M table stays clean.
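To make the "unique key" idea concrete, here is a minimal sketch in plain Lua. It is not cmodule's implementation: `loader`, `loader.import`, and passing the module body in as a function are all assumptions made for illustration. The only details taken from the post are that setting a unique key in `_M` to `false` disables caching, and that the key is removed once loading has finished.

```lua
-- Hypothetical stand-in for cmodule; names and structure are illustrative only.
local loader = {}
loader.cache = {}   -- a fresh table is a key that no string or number can collide with
local loaded = {}   -- already-imported modules, keyed by name

-- `body` plays the role of the module file: it receives the module table _M.
function loader.import(name, body)
    if loaded[name] ~= nil then
        return loaded[name]
    end
    local _M = {}
    local result = body(_M)
    -- Caching stays on unless the module explicitly set the key to false.
    local cacheable = _M[loader.cache] ~= false
    _M[loader.cache] = nil   -- remove the marker so _M stays clean
    local exported = result
    if exported == nil then
        exported = _M
    end
    if cacheable then
        loaded[name] = exported
    end
    return exported
end

-- A module body that opts out of caching, so it runs on every import.
local runs = 0
local function uncached(_M)
    runs = runs + 1
    _M[loader.cache] = false
    _M.value = runs
end

local first = loader.import("demo", uncached)
local second = loader.import("demo", uncached)
```

Because the marker is a table rather than a string, a module can still define a field called `cache` of its own without interfering with it.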

Andrew_Stacey · May 17, 2013, 5:44pm

@toadkick Great!

One small bug: in cmodule.loaded, if a project isn’t specified it uses the main project, not the project containing the module (so it doesn’t behave the same way cimport does).

And a feature request: I’d like to be able to specify a search path for modules so that I don’t have to always specify the project. Most of my imports are from the same library.

toadkick · May 17, 2013, 6:08pm

@Andrew_Stacey: that’s not a bug. Sorry, I thought I put a comment over cmodule.loaded that indicated that the currently running project would be used if the project was not specified…I’ll make sure that’s specified in the next update. All of the functions contained in the cmodule table behave this way (the global cimport/cload, i.e. _G.cimport and _G.cload, which are the ones you are calling if you call cimport/cload in Main, are the same functions as cmodule.import and cmodule.load).

The cimport/cload functions are actually the exceptions (as I explained in a post above)…instead of using the currently running project like the default cmodule.* functions, modules are provided with their own versions of cimport/cload that load from their owning project by default, so that if you decide you want to duplicate a project, you don’t have to update all of your project’s imports.
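The difference between the global import and the per-module import can be sketched as two closures bound to different default projects. This is only an illustration under stated assumptions: `qualify`, `makeImporter`, and the fake loader `echo` are hypothetical names, and the "Project:tab" path format is the one described elsewhere in this thread.

```lua
-- Hypothetical helpers; cmodule's real internals may differ.
-- A path is treated as qualified if it contains a colon ("Project:tab").
local function qualify(path, defaultProject)
    if path:find(":", 1, true) then
        return path
    end
    return defaultProject .. ":" .. path
end

-- Build an import function bound to one default project.
-- `load` stands in for whatever actually loads a fully qualified path.
local function makeImporter(defaultProject, load)
    return function(path)
        return load(qualify(path, defaultProject))
    end
end

-- Fake loader that just reports which path it was asked for.
local function echo(path)
    return path
end

local globalImport = makeImporter("RunningProject", echo)  -- what Main would call
local moduleImport = makeImporter("Library", echo)         -- what a module in Library would get

print(globalImport("utils"))        --> RunningProject:utils
print(moduleImport("utils"))        --> Library:utils
print(moduleImport("Other:utils"))  --> Other:utils
```

Since the default project is captured when the closure is built, duplicating or renaming "Library" would not require touching the unqualified imports inside it.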

I’ll give some thought to module search paths, but I’m not sold on the idea yet. It’s possible that some sort of path search system could add significant overhead and slow down module loading, and honestly I’m not sure if saving on typing the project name on external imports is enough of an annoyance to warrant the change. Not to mention that it makes the code much less clear when you don’t know which projects your imports are coming from. Let me ponder on the impact to the system and possible implementations and I’ll get back to you on that one.

Andrew_Stacey · May 18, 2013, 4:49am

@toadkick You did, I just didn’t see it until after I’d complained!

I’d like a variant of this, please, that works as I described. My use of this is in a library module which wants to know which of its fellow modules have been loaded. As I have lots of projects that use the same library, this varies quite considerably so it really is useful. At the moment I have to hard-code the library project name into the code, making it less portable.

Which brings me on to the search path. This is all about portability. Your cmodule stuff and Briarfox’s autogist will make it much easier to share and organise stuff. But I can’t dictate the project names that others will use for my code so a search path makes this easier to manage.

I’m using a poor man’s search path at the moment using a string for my library project name, but then I’ll end up with difficulties if one project includes modules from another project that includes them from a third (yes, I do do this) since the primary project has to set the string that the middle project uses, breaking the idea that modules are black boxes.

I’m seriously thinking that my project management needs a serious overhaul and a search path will make it easier for me to experiment.

I might have a go at implementing it myself …

toadkick · May 18, 2013, 9:30am

@Andrew_Stacey

“I’d like a variant of this, please, that works as I described.”

Actually, I’d like one too. The challenge is that cmodule doesn’t know which project a module belongs to until you tell it (by calling cimport with the module’s path). Thus, cmodule.loaded has no concept of what project a module belongs to (again, unless you tell it, by specifying the project in the module path passed to cmodule.loaded). What I’d actually like is for all of the cmodule APIs to work the way the module-specific cimport/cload work, but unfortunately since Main is not loaded by cmodule, I have no way to know which project Main belongs to, which is why this inconsistency exists in the first place. So, I said all of that to say this: if I can come up with a way to solve this issue, I will, but it is a tough one.

“Which brings me on to the search path. This is all about portability. Your cmodule stuff and Briarfox’s autogist will make it much easier to share and organise stuff.”

A search path could help with the issue I mentioned above, but again, I fear it will make loading modules slow. For example, if I have 3 project names in my search path, then whenever an unqualified path is provided to cmodule, I have to actually attempt to load the files from each of those projects to find out which one owns the module, which I expect will be slow. And the problem gets worse (and performance characteristics become more unpredictable) the more projects there are in the search path. Maybe I’m wrong though, and in practice it won’t be so slow. I’ll experiment with this.

“But I can’t dictate the project names that others will use for my code so a search path makes this easier to manage.”

Incidentally, cmodule has the same problem, and my “solution” for this was to have the programmer tell cmodule what the running project is by calling the cmodule() function when the program starts. Unfortunately, as you’ve noticed, that solution is not scalable; every dependent library that needs this information has to be initialized in the same way.

I do think that better organization might ameliorate a lot of your issues, but I also think your concerns are legitimate, and I will see what I can do to address them. I’m going to be pretty busy for the next couple of weeks so it may take a while for me to be able to revisit this. However, I am interested in seeing the results of your efforts to implement a search path, so please keep me informed.

toadkick · May 18, 2013, 9:40am

@Andrew_Stacey: I don’t know if this will help you with your specific issue, but I just remembered that every module is provided with 2 variables that provide the owning project name and file name: __proj and __file. Additionally, for convenience, a function called __pathto is provided to each module that concatenates the owning project and specified file name into a Codea path. For example, within a module that belongs to the project Foo, __pathto("bar") will return "Foo:bar".
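As a rough picture of what those per-module helpers amount to, here is a small sketch. The function `moduleHelpers` is hypothetical and only mimics the behaviour described above (two variables plus a function that joins project and file name with a colon); how cmodule actually injects them into a module is not shown in this thread.

```lua
-- Hypothetical sketch of the per-module helpers described above.
local function moduleHelpers(project, file)
    return {
        __proj = project,
        __file = file,
        -- Join the owning project and a file name into a "Project:file" path.
        __pathto = function(name)
            return project .. ":" .. name
        end,
    }
end

-- Helpers as they might look for a module "baz" living in project "Foo".
local env = moduleHelpers("Foo", "baz")
print(env.__proj)            --> Foo
print(env.__pathto("bar"))   --> Foo:bar
```

Inside a real module, a call along the lines of `cmodule.loaded(__pathto("sibling"))` would then name a fellow module without hard-coding the library's project name, assuming cmodule.loaded accepts a fully qualified path as the earlier posts indicate.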

I think this might provide a workaround for your issues. Can you explain in more detail the dependencies and use cases of your library?

toadkick · May 18, 2013, 12:27pm

@Andrew_Stacey: Aha! I think I’ve sorted out a way to make this workable. Tell me what you think of this proposal:

a) cmodule will provide an API for specifying the search path. You’ll pass it a table/variable argument list of projects that you want to be in the search path:


-- set path
cmodule.path{"SomeProject", "SomeOtherProject", "Whatever", "Utilities"}
-- or --
cmodule.path("SomeProject", "SomeOtherProject", "Whatever", "Utilities")

-- get a table listing the projects currently in the path
local cpath = cmodule.path()

This API has the feature (whether this is for better or for worse remains to be seen) of allowing you to modify cmodule’s search path anytime during runtime, since you can get the search path table, modify it, and give it back to cmodule.path(). If that feature is not desirable or provides too much flexibility, I could simply throw an error if more than 1 attempt is made to set the path during program execution.

b) All of the cmodule APIs will continue to do what they do now, which is attempt to load/introspect using the “default” project (the currently running project if you called _G.cimport/cmodule.import, or the owning project if you called cimport from within a module); if the module is not found there, then it will check to see if a search path is defined. If not, it will throw a “module not found” error. If so, cmodule will search the projects in the path in the order they were passed to cmodule.path, and will use the first matching module name it finds. If the module name is not found in any of those projects, it will throw a “module not found” error.

This way, cimport/cload suffers no additional overhead for the common use case, since we only search if the module is not found in the default project, while still allowing more generic/flexible usage of the cmodule library for more special cases.
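The lookup order in (a) and (b) can be sketched as follows. This is an illustration of the proposal, not cmodule's code: `path`, `resolve`, and the `exists(project, name)` helper are hypothetical, and the fake `projects` table stands in for real Codea projects.

```lua
-- Hypothetical sketch of proposals (a) and (b); not cmodule's actual code.
local searchPath = {}

-- Setter/getter: accepts a table, a list of names, or nothing (returns the current path).
local function path(...)
    local n = select("#", ...)
    if n == 0 then
        return searchPath
    end
    local first = ...
    if n == 1 and type(first) == "table" then
        searchPath = first
    else
        searchPath = { ... }
    end
    return searchPath
end

-- `exists(project, name)` is an assumed helper reporting whether a project has that tab.
local function resolve(name, defaultProject, exists)
    -- Common case first: the default project, with no path search at all.
    if exists(defaultProject, name) then
        return defaultProject .. ":" .. name
    end
    -- Otherwise walk the search path in the order it was given.
    for _, project in ipairs(searchPath) do
        if exists(project, name) then
            return project .. ":" .. name
        end
    end
    error("module not found: " .. name)
end

-- Fake project contents for the demonstration.
local projects = {
    Game = { Main = true, player = true },
    Utilities = { vec = true, player = true },
    Whatever = { vec = true },
}
local function exists(project, name)
    return projects[project] ~= nil and projects[project][name] == true
end

path("Whatever", "Utilities")
print(resolve("player", "Game", exists))  --> Game:player
print(resolve("vec", "Game", exists))     --> Whatever:vec
```

With this ordering the search path is only consulted after the default project has missed, which is why the common case pays no extra cost.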

I’m still a bit concerned that this will make it harder to reason about code that uses this feature, since it will not be obvious simply from reading a module’s code where the modules it depends on are located, but pragmatically speaking this is probably the best solution.

EDIT: this is completely workable. I’m about halfway done implementing it, but I probably won’t get to hack on it anymore today. Hopefully by tomorrow evening or sometime Monday I’ll have cmodule updated to 0.1.0, with the search path feature included.

Andrew_Stacey · May 18, 2013, 3:57pm

Here’s my version: https://gist.github.com/loopspace/5605561

I think your logic in (b) is wrong. I want to be able to override a library module from a project. Suppose I’ve some ideas on how to develop a module. Then I might copy it into a test project to experiment and I want the experimental module to override the original one.

My code has caching so that once a module has been found no more searching has to be done and so I don’t think that there’s any serious added overhead.
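One way to picture "search once, then remember" is a resolver that memoizes where each module was found. The sketch below is not the code from the gist: `makeResolver`, `exists`, and the fake `projects` table are hypothetical, and the choice to give the search path priority (so that an experimental copy listed first wins) is an assumption based on the override use case described above.

```lua
-- Hypothetical sketch; this is not the code from the gist.
local function makeResolver(searchPath, exists)
    local found = {}   -- module name -> fully qualified path, filled on first lookup
    return function(name)
        local cached = found[name]
        if cached then
            return cached
        end
        for _, project in ipairs(searchPath) do
            if exists(project, name) then
                local full = project .. ":" .. name
                found[name] = full
                return full
            end
        end
        error("module not found: " .. name)
    end
end

-- Fake project contents, with a counter to show how often projects are probed.
local probes = 0
local projects = {
    Experiment = { shapes = true },
    Library = { shapes = true, colours = true },
}
local function exists(project, name)
    probes = probes + 1
    return projects[project][name] == true
end

-- Listing the experimental project first lets its copy of "shapes" win.
local resolve = makeResolver({ "Experiment", "Library" }, exists)
```

Each module name costs at most one pass over the search path; repeated imports of the same name are answered from the `found` table without probing any project.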

However, my code does mean that if you load a module that isn’t in the search path and it loads modules in its same containing project without specifying a project then they won’t be found. I’m not sure if I think this is a problem or not. At the moment I’m leaning towards it not being a problem.

toadkick · May 18, 2013, 4:11pm

@Andrew_Stacey: I’ll look at what you’ve got when I get a chance (hopefully tomorrow), but if it is not inefficient then I agree with your point on b).

“However, my code does mean that if you load a module that isn’t in the search path and it loads modules in its same containing project without specifying a project then they won’t be found. I’m not sure if I think this is a problem or not. At the moment I’m leaning towards it not being a problem.”

Well, my issue here is that I expect that the common use case for cmodule when writing library code will be heavily skewed toward loading modules from the same containing project. It would be a maintenance hassle to rename/duplicate a project using cmodule (which I do frequently, especially when I don’t yet know what I’m going to end up naming the project). The common API should accommodate this…I spent a lot of time working to specifically address this issue. What I’m thinking is, perhaps providing an alternate API for importing modules would be an acceptable solution; for example a flag passed to cimport that says to give the path priority over the default, or an alternate import function that does this. What do you think?

Andrew_Stacey · May 18, 2013, 6:06pm

Latest version uses the current module’s project as the fallback if all else fails. This needed redoing the override function and revealed an oddity about the ... mechanism: it always takes up at least one argument so f(a,...,b) will always be at least three arguments to f.

toadkick · May 18, 2013, 7:51pm

@Andrew_Stacey: EDIT: ah I see what you mean…I didn’t even realize you could specify additional params after …

toadkick · May 18, 2013, 10:48pm

I just realized why I’ve never seen anyone specify a parameter after …; it’s not usually desirable, because … will only end up evaluating to the first variable argument in the list (or nil if there are no args), discarding the remaining parameters in the list.
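This is standard Lua behaviour and easy to check in isolation. In the sketch below, `count`, `middle`, and `last` are illustrative names; the point is only that a vararg expression is adjusted to exactly one value unless it is the last expression in the list.

```lua
-- Report how many arguments were actually received.
local function count(...)
    return select("#", ...)
end

-- `...` in the middle of an argument list is adjusted to exactly one value.
local function middle(...)
    return count("a", ..., "b")
end

-- `...` at the end of an argument list expands to all of its values.
local function last(...)
    return count("a", ...)
end

print(middle())          --> 3  ("a", nil, "b")
print(middle(1, 2, 3))   --> 3  ("a", 1, "b")
print(last(1, 2, 3))     --> 4  ("a", 1, 2, 3)
```

So f(a, ..., b) always receives exactly three arguments, whatever was passed in, which is why extra parameters are normally placed before the `...` rather than after it.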

I just took a peek at your changes, and honestly I’m not terribly enthusiastic about the API; it feels tricky and a bit confusing. I did notice that you wrapped cmodule.loaded in a closure also to pass to modules, which is okay, but then it raises the question: why not include cexists as well? At this point, we’re almost passing most of the API to each module, which I really wanted to avoid (creating all of those closures isn’t terribly cheap, and replicating that much of the API complicates what I’m thinking is already a kind of confusing interface).

Sorry @Andrew_Stacey, I think I’ll need to rethink some things in order to refactor this to meet your needs, and provide an API that I’d be satisfied with, as well as being minimally invasive performance-wise. Unfortunately nobody else seems very interested (and I feel like your use case is atypical, or at least I only have a vague understanding of your needs) so it’s hard for me to gauge what changes would be more/less useful to others (and still useful to me), and how to prioritize my efforts.

Andrew_Stacey · May 19, 2013, 3:10pm

@toadkick Didn’t realise that about the .... However, there are other ways around that problem.

As I said above, I’m not sure that I want this functionality: that cimport "something" in a module knows the current project that it is in and searches within it. Without that, then I can remove the overloading on all of the cimport routines. Indeed, I’m only overloading those because I can’t get at _resolveCodeaPath directly. What I’m thinking of now is that inside a module then thisProject resolves to the project name (with the colon) so I can do cimport (thisProject .. "something") to force loading from the current project.

The way I’m thinking of it is that the majority of modules that I load will be in library projects and I’ll add them to the search list, so cimport "something" will work because the containing project will be in the search list, and if I want to override something then I can do so since it uses the same search path as the main project. This is, in my experience, how library search paths work in other languages (not that I have a lot of experience). Then if I do load a module from some project not in the search path, and it loads some others, then the likelihood is that all of those modules are known (when I wrote them) to be non-library code and so I would ensure that they used the thisProject prefix when loading. In addition, using thisProject makes them portable.

Don’t get despondent! I think it’s a useful system and I prefer it to mine (it’s a little bit slower than mine but I like the segregation that it enforces).

But I wouldn’t worry overmuch about the impact on performance. The vast majority of module loads will be in the setup and so won’t have a knock-on effect in the actual running of the code.

toadkick · May 19, 2013, 3:52pm

@Andrew_Stacey: just to make sure I understand, “thisProject” will hold the name of the currently running project, or the project that owns the module calling cimport? If the former, this is already available to you via cmodule.project(). If the latter, every module is already provided with a variable “__proj” that contains the name of the module’s owning project, and a function “__pathto” that will prefix a module name with the module’s owning project (see a few posts up).

FWIW, I’m not despondent or giving up, in fact quite the opposite: your interest in this has not only revealed several bugs and design flaws but has also forced me to think of things that hadn’t occurred to me. My main issue for at least the next 3 weeks will just be a lack of time to think up and implement solutions that might require some rewiring. At any rate, I still suspect that either a) clearer organization or b) facilities that are already there that you might not be aware of (or both) might help you accomplish what you are trying to do.

Also, generally speaking I’m not 100% thrilled with the API yet and do want to investigate ways also to make it cleaner and more obvious. Once things are stabilized a bit more, I’ll do an optimization pass.

Andrew_Stacey · May 19, 2013, 5:14pm

@toadkick It was going to be the latter, but when I wrote that then I hadn’t followed through every bit of the code so hadn’t discovered __proj and __pathto. My current modifications now remove the overloading completely.

It is now working how I want and I’m reorganising my library to make it more manageable by splitting it into smaller chunks.

toadkick · May 19, 2013, 5:51pm

@Andrew_Stacey: That’s great news! So, I’d like to solicit some feedback: do you think that the search path functionality I outlined earlier is necessary/desirable, or do you think it provides perhaps too much flexibility? I could go either way. I’m tempted to leave cmodule as it is for now (though I have identified some optimizations I’d like to make), barring perhaps a few small API changes (which will be documented thoroughly in the changelog), unless/until I can determine with confidence that it’s a good feature.

And, as you continue using cmodule, please don’t hesitate to continue to give feedback! You’ve already helped make it better in a very short time.

toadkick · May 19, 2013, 6:30pm

also: should I continue exporting cimport and cload to the global namespace? I’m thinking that perhaps if cmodule forced you to use cmodule.import and cmodule.load inside of Main (or more specifically, inside of a tab that was not loaded with cmodule), that it would get rid of the inconsistency between how cimport/cload handle defaults depending on where it’s called from, and actually provide a clearer interface. What do you think?

Andrew_Stacey · May 20, 2013, 6:43am

@toadkick I think that some search path functionality would be desirable. I’ve just started reorganising my library since this allows me to easily have things in more sensible library projects without having to import every one every time. A search path facility means that I don’t have to remember which project contains which module.

I also like cimport being in the global namespace. I realise that you don’t like polluting the global namespace, but there’s a point at which that principle crosses the boundary and just starts making everything complicated.

I’ve uploaded my version to github: https://gist.github.com/loopspace/5605561 I’ve removed the overloading completely so now it doesn’t search the containing project for modules unless that project is on the search path, but a module can get round this by using the __pathto or __proj. If I ever use these (which at the moment I doubt I will) then I’d probably rename them to something a bit more friendly.

I’ll see how I get on with cmodule and keep reporting back. I think that my modifications are close enough to your original code that my experiences will be of use to you.