Saving the state of default arguments
Anatoliy057 opened this issue · 17 comments
Default arguments are not reinitialized between calls:
proc _test(@test = array()) {
    @test[] = 1
    return(@test)
}
for(@i = 0, @i < 3, @i++) {
    console(_test())
}

Output:
{1}
{1, 1}
{1, 1, 1}
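(A minimal workaround sketch, assuming the goal is a fresh array on every call: move the default initialization into the proc body, similar to the patterns discussed later in this thread.)

proc _test(@test = null) {
    if(is_null(@test)) {
        // Re-create the default on every call so no state is carried over between calls.
        @test = array()
    }
    @test[] = 1
    return(@test)
}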
This raises the question of what the default behavior should be, and what should be allowed as a default argument. The main thing to consider language-design-wise is whether dynamic default argument values should be allowed or not (e.g. a read() or some variable). I think it would be reasonable to require default argument values to be static, with the side effect that they can be evaluated at compile time (through optimization). In that case, non-primitives should be deep-cloned (preferred) or re-evaluated (not preferred) to prevent this bug where a default value can be altered by reference.
If dynamic default values are to be supported, then this bug report is likely intended behavior, or alternatively non-primitive default arguments should always be deep-cloned before being assigned to the function parameter. Consider that in that case, the following holds:
@b = array();
proc _p(@a = @b) {
    ref_equals(@a, @b) // false.
}
which in my opinion is a bit counter to what you'd expect. Not allowing dynamic default arguments solves that entirely.
To help inform this discussion: I've actually used this to store useful maps inside proc definitions. I do not modify these maps in the proc, and because they're not deep cloned this is faster, since the map only needs to be evaluated once when the proc is defined instead of on every call. I would modify my strategy if this was changed.
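(A hedged sketch of that pattern, with made-up names rather than the actual code in use: the map in the parameter list is evaluated once when the proc is defined and then reused by reference on every call, so it must be treated as read-only.)

proc _color_of(@name, @colors = array(red: 'FF0000', green: '00FF00', blue: '0000FF')) {
    // @colors is built once at definition time and shared across calls; never modify it here.
    return(@colors[@name])
}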
Incidentally, that array in @b is currently only shallow cloned in that example code.
I ran into this when I was testing a procedure like:
Resource proc _sub_log(array @params, Resource @builder = res_create_resource('STRING_BUILDER')) {
    //...
    return(@builder)
}
As a result, the message was simply duplicated several times.
@Pieter12345 As I understand from your words, static default arguments are primitives and built-in MS functions, like array(), etc.
A procedure's scope contains no external variables other than the arguments passed to it, and I would also like a procedure's behavior to depend only on the procedure itself and the arguments passed. One could argue that @b is just the default for a passed argument and that all of the above statements remain true, but to me, dynamic default arguments make those statements less strict.
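(A tiny sketch of the scoping rule being referred to, assuming current behavior; the names are made up for illustration.)

@b = array()
proc _p(@a) {
    // Only @a is in scope here; @b from the defining scope is not visible, unlike with closures.
    return(@a)
}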
In short, I'm against dynamic default arguments, because it feels weird. :)
@PseudoKnight That said, if you consider dynamic default arguments as a temporary measure for the absence of OOP, then it makes sense. But that raises another question: should we add a temporary measure whose need will disappear once OOP is introduced?
Anyway, I think deep cloning is better
Here are my thoughts:
In general, I expect the following:
proc _a(@a = whatever()) {}
to be equivalent to:
proc _a(@a) {
    if(unset(@a)) { // where unset is a magic function that detects if the input was set (as opposed to null or something)
        @a = whatever();
    }
}
Further, even with OOP, I would still want this behavior, because overloads in Java that just do things like:
void function() {
    // convenience implementation
    function(new Object());
}

void function(Object o) {
    // Authoritative implementation
}
can quickly get out of hand the more variables you have. Further, with named parameter passing, i.e. _callMyProc(a: whatever1(), b: whatever2()), it becomes much more desirable to be able to continue using optional parameters and not require so many overloads.
So, the moral of the story with regard to this: it's NOT a stopgap measure, and will continue to prove useful long term.
Having said that, the original point here is still worth addressing, which is the dynamic nature of the parameters. When considered in the context of my original code snippet, I'm not sure what the downsides of that would be. Actually, I don't think the discussion should be framed in terms of optimizations anyway: consider the case where the default-value code has side effects; even if it is static, it would still run once.
Anyways, I'm open to being persuaded otherwise, but the current implementation was done out of convenience, not out of a particular design choice (unfortunately), so I'm not tied to it. Had I originally made a conscious choice, I think I would have preferred allowing dynamic values, where the default code is evaluated lazily, but each time it's needed. Side effects would be managed by conceptually (and maybe practically also) "inlining" the default code as illustrated at the top. The only caveat would be that the code in the default would be limited to a single statement, but that's more about enforcing a code readability standard than any practical implementation reason.
I would think that with resources you wouldn't want to create a default one in the arguments anyway. With arrays it's debatable.
I kind of like dynamic defaults (and I kind of wish they weren't shallow cloned). It'd be a nice way to somewhat simulate object methods so I don't have to treat every proc like a static method and pass in the array object. (still possible with shallow clones, but you have to consider every parent array value as final) I get the argument against it, though.
That brings up an extra difficulty to handle. If a compile-time evaluated function results in a non-primitive, should that result in the same reference being passed into all calls of some proc when used as a default argument value? Maybe this is a problem for such a new comp_execute() though; should that make sure to only return immutable values, or should this behavior be well-described and left to the user to (ab)use? As that function doesn't exist yet, I believe this question is out of scope here.
@PseudoKnight Ah, I see. You could use import/export, and pass in some proc that either creates the map for you or returns the already created map. Then the main proc would still call that other proc every time, but a cached map would be returned, rather than having to create it every time.
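(A hedged sketch of that import/export caching approach; the proc names, key, and map contents are made up for illustration. Here the helper is simply called from the proc body, which works regardless of how default argument evaluation ends up being defined.)

proc _get_config_map() {
    @map = import('config.map')
    if(is_null(@map)) {
        // Build the map once, then cache it so later calls return the same map.
        @map = array(greeting: 'Hello', max_retries: 3)
        export('config.map', @map)
    }
    return(@map)
}

proc _greet(@player) {
    // This calls the helper every time, but after the first call the cached map is returned.
    @config = _get_config_map()
    console(@config['greeting'])
}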
As for usage of variables in default arguments, that would indeed be a problem. When lazily evaluating these default arguments, those variables are no longer present. Looking at LadyCailin's first code example, variables should not even be usable in default proc arguments. I'm not sure if that should be supported or not. It makes sense to have the same scoping rules for default proc arguments as for the proc body itself, which could also mean that both are able to use variables declared in the outer scope.
comp_execute is straightforward to implement and can be done with little effort, and would in general be a replacement for your use case (and indeed a better replacement, since it would run at compile time and could then be used as part of the general optimization). See #1238.
Anyways, give this some thought, and then if we can't come up with any cons for the dynamic defaults, I'm fine with the implementation being changed.
Hmm, yes, variable usages. Here's how I envision it working, let me know if you have a different opinion:
Consider:
@b = array();
proc _myProc(@a = @b) {}
to be equivalent to:
@b = array();
proc _myProc(@a) {
    if(unset(@a)) {
        @a = pull_from_parent_scope('@b')
    }
}
Looking at the implementation, we should be able to scan for variable usage in the optional parse trees, and make sure that when the proc is evaluated, those values are passed in through a side channel to the proc to use if necessary. There's only one scope that needs to be scanned, so it's not a particularly difficult implementation or intensive process.
Keeping in mind that default arguments can be complex and contain closures, procs, binds, etc. with their own scoping rules, it is not straightforward to select which variables should be pre-filled in / linked from the parent scope. Cloning the variable list from the environment at every proc definition is an option, but this might be more expensive than desired.
Hmm, that's true. However, I think changing that behavior is too much, so we should support that anyhow. It would be nice to keep environment cloning to a minimum, but I'd rather do that than change the functionality at this time. Another thing to consider is what to do with assignments in the default code, e.g. @a = assign(@b, whatever()). Perhaps disallow that? As I stated, I'm OK with restricting this to single statements.
If MS eventually becomes aware of statements and expressions (at compile time would be sufficient), then only allowing an expression there would make sense. However, assign() currently is an expression, and so are closure() and bind(), so I don't think that will solve the problem here. I'm also not sure what else you mean by 'single statements' here, considering that concat(...) can basically contain anything.
Yes, that's true, so that may not be possible then, either now or at any point in the future. But that still leaves the question open. I guess cloning the environment is the only bulletproof way to do this. Perhaps we can write some optimization code in there so that we don't do the clone in the typical case (i.e. where we can determine for sure that no variables are used), but in the cases where we know there is variable usage (or we can't determine it), we do the cloning just in case.
If we want to make this work like the first code example from @LadyCailin, which I think is reasonable, then what we should do is store the optimized AST of the default argument expression and evaluate it every time a procedure is called (rather than storing the result of one evaluation). This means that references will never be the same, so @PseudoKnight, your usage of the default value as a map would break (but you can use import/export with a few more lines of code to achieve this anyways).
Regarding usefulness in terms of OOP programming: you can indeed use default arguments by reference to store proc-specific static values/fields, but I believe this is a feature that belongs more on the Objects side of MS, where it doesn't cause undesired side effects in otherwise intended behavior.
My default maps wouldn't break, they just wouldn't be as fast since they would have to be created every time. (What I really want is something like an immutable array that can be stored in a script, maybe via that comp_execute() function we talked about. Just fast access to configuration data would be nice.) Lazily evaluated defaults sound good to me, though I'd have to think about it more, particularly about variable defaults as in your example, Pieter.
What we can do is scan the AST and keep a list of variable names that are used within there. That can function as a filter that helps, but isn't complete. Considering that this becomes part of optimization, SA is done at that point and could help here. Scanning the AST for variables would still be required, but SA can tell to which declarations those variables resolve (using StaticAnalysis.getTermScope(ivarASTNode).getDeclarations(Namespace.IVARIABLE, ivarName)). Using Target comparisons, it can know whether these declarations fall within or outside of the default argument expression. There's one catch here, being that ivariables that are used as parameter declarations should not be detected as ivariable references. The most logical way to do that, in my opinion, is by parsing/rewriting those to something like proc(param(cclasstype, 'name'[, valExpression]), ...), which also allows handling parameters with and without default arguments the same way. But that might quickly expand into more fundamental changes.
While this is a step in the right direction, I believe that we'd still have to deep clone mutable pass-by-reference variables in the variable list to ensure that ivariables in default arguments are not affected by any code below there. I still have mixed feelings about this.
We barely use defaults in proc declarations (a few old exceptions have true/false/null or a static integer/string value as the default); we usually do default argument "checks" in the proc body using falsey defaulting, like:
proc _test(@player, @a, @b) {
    @player = @player ||| player();
    @a = @a ||| 'whatever';
    @b = @b ||| array();
    ...
}
In the past we did something like:
proc _test(@player, @a, @b) {
    if(is_null(@player)) @player = player();
    if(is_null(@a)) @a = 'whatever';
    if(is_null(@b)) @b = array();
    ...
}
If we get the default values from a database or file, then we usually "cache" them using import/export so it doesn't have to be done on every proc call.
While the behavior described in the issue doesn't affect our current code, we ran into it early on and it pushed us toward writing the code the way we do now.
I personally don't mind sticking to the way we do default args now (|||). But the behavior described in the issue should be documented somewhere.
Luke feels that it makes the most logical sense for
proc _a(@a = whatever()) {}
to be equivalent to:
proc _a(@a) {
    if(unset(@a)) { // where unset is a magic function that detects if the input was set (as opposed to null or something)
        @a = whatever();
    }
}