Identical names for Instances of different types in the same Model. Bug or feature? #781
Comments
@RS-Credentive - Thank you for your question. The code you point to was inserted 3 years 7 months ago by a team member who is no longer with us. I will need to research it with @wendellpiez before addressing it.
@RS-Credentive awesome question - I wish I had a better answer. The capability to overload names like this, for better or worse, is a feature of Metaschema. Keep in mind that not only can you have different types going by the same names - sometimes a locally defined type (i.e. a 'type in context') can differ from a global type, albeit with the same name. Etc. Were we coming at the same problem again today I might have argued harder for many more constraints over naming. But the capabilities offered both by local definitions (which were not in the original Metaschema v0.1) or I'm glad to address this further in a broader Metaschema context, as well.
BTW take note that renaming the underlying structures while deploying In any case it's something to think about for an improved OSCAL and could probably even be done in the metaschema sources with backward compatibility wrt data. |
Let me share my perspective on this question since I'm approaching it from a different angle. I see documents conforming to Metaschema models as the output of a process that incorporates data sourced from outside the model or produced by processes outside the model.
In these cases, the elements of the Metaschema documents will be built from the bottom up, not the top down. The data is already there, or is coming from somewhere else, and we need to translate it into a structure that can be represented using one of the Metaschema encodings. The tools generating the structure will know about a "location", but people writing the tools shouldn't really have to care about fields or flags. It's just a piece of data that is related to another piece of data in a hierarchical structure. If the software composes the hierarchy correctly, a Metaschema-aware library can validate it and produce the expected output in the appropriate encoding.

Now consider the case of the "location-uuid". When I construct a "location-uuid" for inclusion in OSCAL metadata, I need to know whether it is a "flag flavored" or "field flavored" uuid. Of course, my application won't care about a flag or field, but presumably, some assemblies will want a "field flavored" uuid, and some fields might want a "flag flavored" uuid. Here are the definitions that got me all worked up, by the way:

<define-flag name="location-uuid" as-type="uuid">
<formal-name>Location Universally Unique Identifier Reference</formal-name>
<description>Reference to a location by UUID.</description>
<prop name="value-type" value="identifier-reference"/>
<prop name="identifier-type" value="machine-oriented"/>
<prop name="identifier-scope" value="cross-instance"/>
<constraint>
<index-has-key name="index-metadata-location-uuid">
<key-field target="."/>
</index-has-key>
</constraint>
</define-flag>
<define-field name="location-uuid" as-type="uuid">
<formal-name>Location Universally Unique Identifier Reference</formal-name>
<description>Reference to a location by UUID.</description>
<prop name="value-type" value="identifier-reference"/>
<prop name="identifier-type" value="machine-oriented"/>
<prop name="identifier-scope" value="cross-instance"/>
<constraint>
<index-has-key name="index-metadata-location-uuid" target=".">
<key-field target="."/>
</index-has-key>
</constraint>
</define-field>

I need to distinguish between the flag flavor and the field flavor not because there's any difference in the data, but because I have to encode them differently if I encode them in XML. This is absolutely not the point of Metaschema. A flag should be a flag because you don't need to attach any metadata to it, and a field should be a field because you might need to attach some metadata to it. The distinction between field-flavored and flag-flavored locations doesn't convey any useful information, but tracking the flavor of data will add a lot of complexity to the wrong layers. The fact that there are two different top-level, global definitions in the same schema with the same name means that libraries, and thus application authors, will have to keep track of which flavor they've got and which flavor they need.

I would propose the following rule, enforced technically if possible but with strong guidance if not: I see this being the same as an import cycle. Technically, we can't design a way to forbid import cycles in Metaschema itself, but implementations are required to nope out immediately if one is detected. Similarly, even if we can't design a constraint or rule in the XSD that forbids instances of different types with the same names, or instances of the same type and name with different definitions, it should be considered bad style.
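To make the encoding point above concrete, here is a minimal Python sketch of how the same UUID value ends up in different XML shapes depending on whether it is modeled as a flag (an attribute) or a field (a child element). The parent name `holder`, the helper function names, and the sample UUID are all hypothetical, and the JSON line is a simplification that assumes a field with no flags of its own; this is not how any particular Metaschema library does it.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample value, used only for illustration.
LOCATION_UUID = "11111111-2222-4333-8444-555555555555"

def xml_with_flag(parent_name, uuid):
    """Encode the value as a flag: an XML attribute on the parent."""
    parent = ET.Element(parent_name)
    parent.set("location-uuid", uuid)
    return ET.tostring(parent, encoding="unicode")

def xml_with_field(parent_name, uuid):
    """Encode the value as a field: a child element with text content."""
    parent = ET.Element(parent_name)
    ET.SubElement(parent, "location-uuid").text = uuid
    return ET.tostring(parent, encoding="unicode")

def json_property(uuid):
    """Assumed JSON shape: a plain property in either case."""
    return json.dumps({"location-uuid": uuid})

flag_xml = xml_with_flag("holder", LOCATION_UUID)
field_xml = xml_with_field("holder", LOCATION_UUID)
json_text = json_property(LOCATION_UUID)

print(flag_xml)
print(field_xml)
print(json_text)
```

The data is identical in all three outputs; only the XML serializer has to know which "flavor" it was handed.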
Thanks for putting this on the record @RS-Credentive, you make a good case. I might make rules for a next-generation metaschema even tighter than this ... if it were up to me. As @iMichaela hinted above, the current design bears the marks of the process of evolution that produced it: it is not without flaws, both acknowledged and unknown. (And having them exposed for remediation is better than just suffering with them.) In general I would also offer that the difference between a field and a flag is much more consequential in XML-flavored Metaschema applications than in JSON-flavored applications. Both aesthetically and with respect to affordances (in the model), "element or attribute" is a difference that can make a difference. Indeed as a data modeler, I would offer that (a) if you only care about JSON never XML, and (b) you never have mixed content (i.e. Markdown-y) data values including In any case, at a metaschema-redesign table, I would see you and raise you a bunch -- no one has asked but I would actually like to think about doing away with all local definitions and all overloading, with only For the record, a rule such as what you propose could indeed be enforced with the help of a query over a set of metaschema documents (however defined). However, I am not sure I would even go that far, or not yet. Is there a concrete recommendation that could be made for this case only? For example to rename the field to say In any case the problem goes a little beyond discerning and enforcing the right kinds of consistency. We need a development model (dare I say a spiral) capable of implementing and demonstrating these ideas through actual testing. Nonetheless the input is valuable and demonstrates the need. |
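As a rough illustration of the kind of query mentioned above, the following Python sketch scans the top-level definitions of a single metaschema document and reports names that are defined under more than one definition type. It is only a sketch under stated assumptions: the `SAMPLE` document is a cut-down, namespace-free stand-in rather than the real OSCAL source, the function names are hypothetical, and a real check would also need to follow imports across modules.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Top-level definition elements to compare against each other.
DEFINITION_TAGS = {"define-flag", "define-field", "define-assembly"}

def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def find_name_collisions(xml_text):
    """Map each name to its definition kinds, keeping only names with 2+ kinds."""
    root = ET.fromstring(xml_text)
    kinds_by_name = defaultdict(set)
    for child in root:  # direct children only, i.e. global definitions
        kind = local_name(child.tag)
        if kind in DEFINITION_TAGS:
            kinds_by_name[child.get("name")].add(kind)
    return {
        name: sorted(kinds)
        for name, kinds in kinds_by_name.items()
        if len(kinds) > 1
    }

# Simplified stand-in for a metaschema module (assumption, not the OSCAL source).
SAMPLE = """<METASCHEMA>
  <define-flag name="location-uuid" as-type="uuid"/>
  <define-field name="location-uuid" as-type="uuid"/>
  <define-field name="remarks" as-type="markup-multiline"/>
</METASCHEMA>"""

collisions = find_name_collisions(SAMPLE)
print(collisions)  # {'location-uuid': ['define-field', 'define-flag']}
```

A tool could treat a non-empty result as a style warning rather than a hard error, which matches the "strong guidance" option discussed in this thread.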
@wendellpiez, thanks for the thoughtful reply. Regarding my concrete recommendation for the case at hand, it looks like "location-uuid" is only referenced as a field in the particular specification we're talking about, so it could be as simple as eliminating the redundant "flag flavored" definition of location-uuid. I'll make that recommendation over on the OSCAL issue.

It's possible that a Metaschema could be as expressive without fields, and maybe that simplification would be good, but I think one of the benefits of Metaschema is its encoding independence, so I think the importance of XML vs JSON/YAML shouldn't really be a factor. If XML is really, really important to your application, you should use one of the XML schema specifications. Am I mistaken to think that the question of how to model the data should be distinct from the question of encoding the representation of the data? If there's truly no difference between a field and an assembly from a pure data modelling perspective, then I guess Metaschema 2 should just pick one of them and go with it :)
@RS-Credentive thanks, perfect. As to larger question, I'd probably agree with you except for the perturbing fact that there is a great deal of information in the world (let's not call it 'data') that is not yet encoded, not yet machine-readable, not yet processable using any model. That sounds grand, but it's very mundane. We are not starting with RDBMS but with something closer to PDF (but not even). (Once upon a time I wrote a paper on this topic, here: https://balisage.net/Proceedings/vol21/html/Piez01/BalisageVol21-Piez01.html - for reading while waiting for a bus?) It's not JSON's fault that it has no native constructs for what's called 'mixed content' - to the point that in actual systems (such as the one where I'm typing) we end up folding in Markdown (urp!) to achieve some meager machine readability. In Metaschema, the entire modeling problem there is tucked away into the Accordingly, it leaves you and devil's-advocate-me to address the question: what about the This could undoubtedly be designed differently, the point here being not that we have found the best balance, but rather that a balance must be struck, and that's where we put it. I actually think fields are useful in other ways as well, conceptually. And without a better solution for full-text data capture of catalogs, the markup-line and markup-multiline dedicated datatypes have stood up pretty well. (FWIW to consider them datatypes instead of a special kind of node was one of the many important contributions of @david-waltermire.) As for the advice that if XML is so important, we should just be using an XML technology ... you sound like me now. :-) From an XML point of view, Metaschema is really just a set of rules to follow for keeping our hands tied behind our backs, since the JSON people don't have hands, we can only play football with them (no hands), not handball (or 'football' with throws, catches and grappling). 
The 'hands' in this analogy is a concept of mixed content, whether mixed element content such as (HTML) |
Please see this issue from the OSCAL Repo
The OSCAL metadata metaschema defines a field named "location-uuid" and a flag named "location-uuid". I get that a flag can reference an externally defined flag, and a field can reference an externally defined field, so the names could be the same, but this introduces a good amount of complexity because namespaces are needed to disambiguate between globals of different types with the same name. Is this kind of name conflict intentional? Is it possible to require unique names for instances in the same model? At least for top-level globals?