Base Modelica package is self-contained wrt lookup #3550
base: MCP/0031
Quoting email conversation:
This is actually a good point, we should leverage the partitioning of the Base Modelica namespace. I'll update accordingly. |
When I think about it slightly more, I am afraid we can't get rid of global name lookup. Consider this full Modelica model:
How would you express this function in Base Modelica without having both non-global and global name lookup? |
What actually could be useful in Base Modelica would be to say that global name lookup means lookup inside the Base Modelica package. That is, global name lookup would not be used for built-in names (no need thanks to namespace partitioning), and we would avoid the ugly need to always begin global names with the mandatory name of the package itself:
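A minimal sketch of the idea, in Python rather than Modelica, assuming a class tree represented as nested dicts (the function `resolve` and the tree layout are illustrative assumptions, not part of any tool or of the proposal text). A name with a leading dot is resolved starting inside the Base Modelica package, so the package name never has to be written:

```python
def resolve(name, scope_chain, package_root):
    """Resolve a dotted name in a toy class tree (hypothetical representation).

    scope_chain:  list of dict scopes, innermost first.
    package_root: dict holding the contents of the one Base Modelica package.
    """
    parts = name.split(".")
    if parts[0] == "":
        # Leading dot: "global" lookup starts inside the package itself.
        node = package_root
        parts = parts[1:]
    else:
        # Ordinary lookup: find the first part in the nearest enclosing scope.
        for scope in scope_chain:
            if parts[0] in scope:
                node = scope
                break
        else:
            raise KeyError(parts[0])
    for part in parts:
        node = node[part]
    return node


# Toy tree: model r contains record r and function f, whose output is also named r.
record_r = {"x": "Real"}
function_f = {"r": "output r"}
model_r = {"r": record_r, "f": function_f, "rx": "component rx"}
package_root = {"r": model_r}
inside_f = [function_f, model_r, package_root]
```

Under these assumptions, `resolve(".r.r", inside_f, package_root)` reaches the record even though the plain name `r` inside the function is shadowed by the output component, and the reference does not need a third `r` for the package.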
|
Absolutely. I would say that this is essential, not just useful. The two main points of generating a BaseModelica model from a Modelica model are:
If the original Modelica model gets some stuff from the global namespace (which nobody actually does, but that's another story), this stuff should be included in the flat BaseModelica output, and then it's no longer in the global namespace, but in a separate, local namespace. From this point of view |
As per my comments, I would actually forbid the use of global namespace component references such as .x.y.z. The leading dot provides no useful information, so there is really no point in doing that.
I suppose you didn't consider my #3550 (comment)? |
Sorry, I missed that.

From a modeller's point of view, using the same name for a model and for a record contained in that model seems to me a bad modelling practice. The record is a part of the model, so from a logical point of view it shouldn't have the same name. As Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." For sure one shouldn't try to make things even worse, with obfuscation practices such as naming everything

To make a more concrete example, if the model is a

From a compiler's perspective, the leading dot would solve all ambiguities, but from a human modeller's perspective, writing a model where

So, to me the fact that your example cannot be rendered in BaseModelica without absolute pathnames is really no big deal. But of course it is allowed by the language specification, so we have to deal with it somehow.

At the end of the day, in 20+ years of Modelica modelling practice, I've never used an absolute Modelica pathname myself, nor have I ever seen anybody else do that. As far as I am concerned, we could apply my beloved Ockham's razor, deprecate this concept, and eventually get rid of it in the future. But I understand this could raise a lot of completely unnecessary controversy, so I'm not sure I want to go down that rabbit hole.

Summing up:
All that being said, maybe the best solution is to only put absolute pathnames in BaseModelica if the original model also uses them. @henrikt-ma what do you think? |
I am not sure that the existence of fully qualified names in full Modelica is of primary interest here. I would not be surprised if it is possible to come up with a full Modelica model without any fully qualified names, that can't be translated into Base Modelica. I think the source of the problem is that Base Modelica is (currently) inheriting the full Modelica lookup rules, even though we are imagining Base Modelica code to have a very different sort of hierarchical structure compared to full Modelica. That we use a name mapping to allow reconstructing the original full Modelica component hierarchy is also part of the problem, as this means that the Base Modelica component names cannot be chosen freely. Since fully qualified names offer a way to make scope independent references, they are a powerful concept for automatically generated code, but we should also recognize our lack of evidence that fully qualified names are sufficient to obtain correct lookup in any lowered full Modelica model. It would be wonderful if we could find a computer scientist with the time and interest to prove it! Just a thought: Since function input/output component names are not part of the result variable hierarchy, what if one would use automatically generated unique names for all function inputs and outputs as a way to reduce risk of shadowing conflicts? Would it still be possible to find a variation of my example where a fully qualified name is needed for correct lookup in the lowered model? |
Sure
From what I understand, it will only be a strict subset of those found in full Modelica. For example, we don't have to worry about lookup of classes defined by inheritance, nor lookup of inner/outer stuff. Am I missing something?
To me this is a hard requirement. Base Modelica models should not be obfuscated: every component in there should be clearly recognizable as coming from the original structured Modelica model.
I am surprised by this statement. How is it ever possible? |
@henrikt-ma, another thought on this from my side. The main source of confusion here is that Modelica has all these very complicated relative pathname lookup rules: enclosing scopes, inner/outer, encapsulated packages, lookup into parent classes, etc. And rightly so, because they are essential to build modular, flexible, reusable models and packages. However, there is really no need for that in Base Modelica. If you are considering one specific, atomic, flattened system model, all pathnames should be (at least conceptually) absolute. They are just fully qualified unique names. There is no point in having relative pathnames in Base Modelica models, nor in having complicated lookup rules to handle them. Except perhaps to have slightly more compact .bmo files, but I think this shouldn't really be a concern for us, as LZW compression algorithms can take care of that easily.

A brute-force solution based on this consideration is to require that all pathnames in Base Modelica are absolute. That would solve all shadowing issues at the root, because you'd always start the lookup from the global scope of the flattened model. It would also simplify the lookup rules a lot: basically the only lookup rule would be to always start from the top. The only drawback of this approach is that all names will always start with

What if we skipped this prefix from the actual Base Modelica output, but then implicitly added it when compiling the Base Modelica code? I think this would solve all of our problems. So, the lookup rules of Base Modelica could be something like this (excuse the possibly imprecise language, I'm not a compiler developer myself):
@henrikt-ma what do you think? |
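As a rough illustration of the "always start from the top, with an implicit package prefix" idea, here is a small Python sketch. It assumes a nested-dict class tree and a hypothetical package named `P`; none of these names come from the proposal itself:

```python
def lookup_from_top(name, package_name, package_contents):
    """Single lookup rule: prepend the implicit package name, then walk down.

    No enclosing scopes are searched, so a local declaration can never
    shadow the target of a reference.
    """
    path = [package_name] + name.split(".")
    node = {package_name: package_contents}
    for part in path:
        node = node[part]
    return node


# Hypothetical package P with one function and one flat model.
contents = {"f": "function f", "m": {"x": "Real x"}}
```

With this rule the written name `m.x` is treated as if it were `P.m.x`, whatever scope the reference appears in. The sketch says nothing about how readable the resulting Base Modelica code would be.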
I am not sure it is that easy, and I am neither sure we'd like the look of the resulting Base Modelica code. For example, consider lowering this Modelica model:
If I understand your proposal correctly, we'd get this:
In addition to looking extremely cumbersome for humans to read and write, I don't think we'd be following full Modelica lookup rules anyway. At least, this is not considered a valid reference to
|
I don't know, but flattening out hierarchy is a non-trivial operation where lookup contexts in the lowered model are quite different from the original full Modelica. Could you please outline why you find it so obvious that it would require a fully qualified name in the full Modelica model to end up in trouble? |
@henrikt-ma you are right, I should have worked out a few examples before posting it. I have to think more about it. Before we continue the discussion, though, I need some clarifications. I understand that, earlier on, the idea of Base Modelica was to have everything contained in a flat model. Then, the design was modified to have a package that contains the flat model plus function, record, and package constants definitions at the same level as the flat model.
Thanks! |
Absolutely. I guess I severely underestimated the cardinality of the zoo of possible cases we could have here 😅
My argument is that there are well-defined (albeit very complicated) look-up rules in Modelica, including relative and absolute pathnames, that uniquely identify which class every identifier points to. So, when outputting Base Modelica code, you just need to give those classes unique names. This may be a non-trivial problem to solve, but a solution exists for sure. Of course this may require altering the mangled record (and function?) class names to ensure they are unique, e.g., by adding some unique suffixes, so as to avoid shadowing. If that is the price to pay, I'm fine with it. |
At the end of the day, I understand that the issue captured by your MWE (I modified it slightly to make it valid)

    model r
      record r
        Real x;
      end r;
      function f
        output .r.r r;
      algorithm
        r := .r.r(1.0);
      end f;
      r rx = f();
    end r;

is that there is a lot of potential for shadowing if you use the same name

    package 'r'
      model 'r'
        record 'r$rec_1'
          Real 'x';
        end 'r$rec_1';
        function 'f'
          output 'r$rec_1' 'r';
        algorithm
          'r' := 'r$rec_1'(1.0);
        end 'f';
        'r$rec_1' 'rx' = 'f'();
      end 'r';
    end 'r';

A dumb implementation of Base Modelica output would add unique suffixes to all record definitions, just in case. A less dumb implementation may only do it when necessary to avoid shadowing. In fact, if we could safely bring function and record definitions out of the model (see my previous comment), this wouldn't even be necessary, because you could have

    package 'r'
      record 'r.r'
        Real 'x';
      end 'r.r';
      function 'r.f'
        output 'r.r' 'r';
      algorithm
        'r' := 'r.r'(1.0);
      end 'r.f';
      model 'r'
        'r.r' 'rx' = 'r.f'();
      end 'r';
    end 'r';

which has no shadowing thanks to name mangling.

Once you make sure you don't have shadowing, I reckon we could comfortably use relative pathnames in Base Modelica models, and handle them with a subset of the regular Modelica look-up rules, which does not include inheritance, inner/outer, and encapsulated. That would also make it easy to write a Base Modelica compiler if you already have a Modelica compiler.

The only issue I see here is whether we can define in an unambiguous way what "shadowing" means, so that we can require that legal Base Modelica models do not have any shadowing. But maybe you know how to solve this problem. It might also be that if you always bring the constant, record, and function definitions out of the flat model to the same level as the flat model in the container package (all with the proper mangled names), all ambiguities are resolved automatically, so that unique suffixes are not really required. |
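The "less dumb" variant, which adds a suffix only when a type name would otherwise collide, could be sketched roughly as follows in Python. The function `uniquify`, its inputs, and the very coarse notion of collision (any equal name anywhere in the package) are assumptions made for illustration; a real definition of shadowing would need to be scope-aware:

```python
import itertools


def uniquify(wanted, taken):
    """Give each type a unique mangled name, suffixing only on collision.

    wanted: dict mapping an original class path to its preferred mangled name.
    taken:  names already used by other declarations (components, models, ...).
    """
    taken = set(taken)
    counter = itertools.count(1)
    result = {}
    for path, name in wanted.items():
        candidate = name
        while candidate in taken:
            # Collision: fall back to a tool-generated suffix, as in 'r$rec_1'.
            candidate = f"{name}$rec_{next(counter)}"
        taken.add(candidate)
        result[path] = candidate
    return result


# Record r.r collides with model 'r' and output 'r'; hypothetical record r.s does not.
renamed = uniquify({"r.r": "r", "r.s": "s"}, taken={"r", "f", "rx", "x"})
```

Here only the colliding record gets a suffix, while the other one keeps its readable name. The cost of this sketch is a set lookup per candidate name; the hard part in practice is deciding what counts as a collision.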
I still don't know if there is a simple solution here, but at least I think we are on the same page now. The idea of making sure shadowing can be avoided is interesting. Maybe the biggest danger in this regard is the common name of the |
I think we could work a bit more on that, maybe @perost can also make some suggestions
Could be, though if we are only using relative pathnames in the Base Modelica model, the package name should never actually show up explicitly, right? |
Do we want to end up in a situation where all tools (that don't want to end up generating Base Modelica with shadowing issues) put all type names in the non-structured part of the namespace which is reserved for tool-generated names? Maybe that's OK, although I had always imagined that structured (single-quoted) names would be used for all types with a direct correspondence to a place in the full Modelica class tree. Anyway, I also came to think of another source of shadowing conflicts, namely the global constants. Consider lowering
Without use of fully qualified names, the Base Modelica would look something like this:
This could be resolved by the use of a fully qualified name for the reference to the global constant. Again, I note that such fully qualified references would be more appealing in Base Modelica if they all didn't have to repeat the name of the one and only top-level package. |
@henrikt-ma if we decide to avoid shadowing by adding unique compiler-generated suffixes, this case is no big deal either:

    package 'M'
      constant Integer 'P.n$const_1' = 5;
      model 'M'
        Integer 'k' = 'P.n$const_1';
        Real 'P.n' = 'k';
      end 'M';
    end 'M';

To me it is preferable to fix the (few) instances of shadowing by adding unique compiler-generated suffixes, rather than using awkward solutions that work in all possible corner cases. At the end of the day, the correspondence between the Base Modelica code and the original code (in particular as regards identifiers) is interesting for humans, who will eventually need to understand the meaning of the simulation results. For that, a few added unique suffixes are no big deal. From a compiler point of view, all Base Modelica identifiers are just (unique) labels.

My question is: how easy or difficult is it to implement this shadowing avoidance in a compiler? I have no idea, maybe it's trivial, maybe it's an NP-complete problem. Maybe @perost and @HansOlsson can also comment on that?

Of course one could also take the more radical view that Base Modelica is not intended to be written by humans, but only to be used as an intermediate machine language. In that case, awkwardness would not be an issue. However, while I agree that nobody should ever be writing Base Modelica code, people may want to read it, e.g. for debugging. So, I'd like to avoid awkwardness as much as possible. |
Sure. Using properly mangled names is crucial for some things, but not all. While I was always prepared that tool generated names might be needed for types and functions, I wasn't ready to accept that global constants were at risk. For example, I thought that if the value of On the other hand, wouldn't it be nice if we could ensure that the package constants actually can use mangled names that will be the same across tools? If fully qualified references can be used to achieve this, why not? |
I would be surprised if it would be hard to implement separate lookup of types/functions and other components. (Base Modelica would not be the first language with such separation.) I think this could be very helpful in resolving shadowing issues, as the components in the |
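To make the idea of separate lookup concrete, here is a small Python sketch in which each scope keeps two tables, one for types and functions and one for components. The `Scope` class and the `kind` argument are illustrative assumptions, not a description of any existing tool. C does something comparable by keeping struct tags in a name space separate from ordinary identifiers:

```python
class Scope:
    """Toy scope with separate tables for types/functions and components."""

    def __init__(self, parent=None):
        self.parent = parent
        self.types = {}       # records and functions
        self.components = {}  # variables, parameters, constants

    def lookup(self, name, kind):
        """Search enclosing scopes, but only in the table matching `kind`."""
        table_name = "types" if kind == "type" else "components"
        scope = self
        while scope is not None:
            table = getattr(scope, table_name)
            if name in table:
                return table[name]
            scope = scope.parent
        raise KeyError(name)


# Model scope declares record r; the function scope declares an output also named r.
model = Scope()
model.types["r"] = "record r"
func = Scope(parent=model)
func.components["r"] = "output r"
```

Because the position of a name in the grammar tells the compiler whether a type or a component is expected, the output `r` no longer hides the record `r` when the function body refers to the type.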
There is no need for mangled names to be the same when generating Base Modelica from a given Modelica model. BTW, we already know that it is possible to generate very different Base Modelica models from the same Modelica models, depending on whether you inline functions, evaluate constants, evaluate parameters, scalarize arrays, etc. The important point is that the same Base Modelica model is interpreted in the same way by all tools, which should be OK as long as the mangled names are unique. But maybe I did not get your point here. That said, we could also say that Base Modelica should in general use relative pathnames, but switch to absolute ones in case of shadowing. That is probably even better than adding unique suffixes to mangled identifiers. Is that what you meant? |
I find it utterly important that lowering to Base Modelica can be used as a first step of translating a Modelica model. That the amount of scalarization is expected to differ between tools does not imply that there should be any difference in the hierarchies of unmangled names when producing a CSV simulation result or when exporting an FMU with structured naming convention – at least not for the variables within the instance hierarchy of the translated model. Functions, type names, and package constants on the other hand, are not part of this hierarchy but I think Modelica users would highly appreciate if mangled names with obvious interpretation are used as much as possible. However, it is expected that some of the functions, types, and global constants are tool-generated and do not correspond directly to a location in the original class tree, so it is expected that tool-generated names will be used at least in some cases within these categories. |
No. Whether fully qualified names or relative names are used to reference variables does not directly matter for the ability to obtain correct names corresponding to the original full Modelica model's instance hierarchy. In general, however, shadowing conflicts may require things to be named differently. With language features such as relative as well as fully qualified references, or separate lookup for types/functions and local components, the risk of shadowing can be reduced or eliminated without resorting to tool generated names for all types, functions and global constants. |
There are several ways. Just ensuring that all labels are unique would be trivial in Modelica and Base Modelica. However, instead of trying to find the possible exceptions the simplest is to have non-overlapping name-spaces for the generated contents - e.g., start all global variables with a dot (and non-global with a letter):
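One possible reading of this partitioning, sketched in Python: the tool puts a dot at the start of every mangled global name, and never at the start of a local one, so the two sets of quoted identifiers cannot overlap. The function `mangle` and the exact placement of the dot inside the quotes are assumptions for illustration only:

```python
def mangle(path, is_global):
    """Return a quoted identifier; globals get a leading dot inside the quotes."""
    if path.startswith("."):
        raise ValueError("original paths are expected without a leading dot")
    name = "." + path if is_global else path
    return f"'{name}'"


# A package constant P.n and a local component that also mangles to P.n.
global_name = mangle("P.n", is_global=True)
local_name = mangle("P.n", is_global=False)
```

With this convention the constant and the component get different identifiers without any suffix counter, and the rule is purely a naming choice made by the generating tool.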
This avoids having to use global name lookup to go out of the package and then into it again; and we could even remove that case in the PR. It is also easy for users to understand, and doesn't require any special language rules - just a sensible implementation. As for a more detailed look at shadowing:
The reason for this is that if you have the following (the record isn't essential):
We cannot know whether |
BTW: I don't understand the use of "atomic" in the name of the PR. |
The meaning of atomic here is that the Base Modelica package is entirely self-contained; there should be no references to anything outside the package. I'll update the title. |
Absolutely. It guarantees that you can do things locally without bothering about global issues. But I don't think it is essential for Base Modelica, which is just a way to express self-contained models.
True, but again this is the case of Modelica. As the title of this ticket implies, the rationale of Base Modelica is not to describe modular components that can be combined with other components, but rather to describe completely self-contained runnable models. The whole point of Base Modelica models is that everything that is needed to run them is inside them, there should be no dependencies on other entities. The logical consequence is that lookup of stuff outside the Base Modelica model should not be allowed, or be irrelevant. So, in the case of Base Modelica, there won't be other packages. |
When I think about our implementation, where the Base Modelica package is stored as part of the normal class tree for convenience, I realize that we will need to be careful about how lookup is implemented. Since we will have to do something special to ensure correct Base Modelica lookup, I don't see any additional implementation cost of saying that fully qualified names are looked up starting inside the Base Modelica package, which would then make fully qualified names in Base Modelica a handy way to safely refer to all functions, types, and global constants. This could be a more digestible alternative to having separate lookup for types and functions, while also providing a clean solution for referencing the global constants. |
True, but my point is that if we consider it irrelevant (but allowed) we can formally have shadowing, but it doesn't matter. Thus we can state the two goals as:
Thus the lookup would be a sub-set of Modelica lookup. |
Addressing what I believe is the obvious one of the two items @casella is adding to the roadmap in #3549. I mean obvious in the sense that I believe we agree on the desired effect, not that it is obvious exactly how to formulate it.