msdemlei
Collaborator

This is an alternative to PR #8 and was developed in the discussion there.

@mbtaylor
Member

mbtaylor commented Oct 1, 2025

Yes, I think this would be a good solution.

As I noted in my comments on #8 there is some scope for older clients getting confused by multiple limit declarations differentiated by an attribute they don't know about. In existing versions of TOPCAT, I think this would result in multiple apparently contradictory entries in the TAP window Max Rows selector (only really apparent if you actually click on the thing, which probably most users don't most of the time). That could look a bit weird to users, but I wouldn't say it's catastrophic, and arguably is more informative than the existing state where you see a single limit that may be invalid for one of sync/async. I can't speak for other clients.

@stvoutsin

I think this approach looks great and does indeed seem to work for the various use cases.
Just for completeness, I was wondering whether we want to clarify a couple of edge cases:

Should we clarify how clients should choose if both a general declaration and a mode-specific declaration exist for the same element type?

For example say that the service declares:

<executionDuration>
  <default>30</default>
  <hard>120</hard>
</executionDuration>

<executionDuration forMode="sync">
  <default>60</default>
</executionDuration>

<executionDuration forMode="async">
  <default>3600</default>
  <hard>86400</hard>
</executionDuration>

What would be the default for sync? I'm assuming it should be 60, but is it clear from the document what clients should do? And is the hard limit for sync 120 or undefined? In other words, should clients scan all elements and merge them, or use first-match?

The second edge case is that the schema allows multiple declarations for the same mode.
For example I think this would be valid:

<outputLimit forMode="async">
  <default unit="row">100000</default>
  <hard unit="row">1000000</hard>
</outputLimit>

<outputLimit forMode="async">
  <default unit="row">200000</default>
  <hard unit="row">5000000</hard>
</outputLimit>

Should we outline how clients should handle this, and should the document prohibit it somehow?
Since XSD constraints seem difficult here (unless there is an easy way I'm not aware of), should we add explicit guidance in the specification?

Something like:

Uniqueness of Mode-Specific Declarations

For each limit type, servers MUST NOT include more than one element with the same forMode attribute value.

Servers MAY include:

  • At most one element without forMode (the general declaration)
  • At most one element with forMode="sync"
  • At most one element with forMode="async"

If the capabilities document violates this constraint, clients SHOULD/MUST(?) use the first matching element.
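
As a rough illustration of how a validator or client might detect a violation of the proposed uniqueness rule, here is a minimal Python sketch. It rests on assumptions: the function name `duplicate_declarations` and the `<capability>` wrapper are hypothetical, XML namespaces are omitted, and the limit element names are the four discussed in this thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The four limit element names discussed in this thread.
LIMIT_ELEMENTS = ("retentionPeriod", "executionDuration",
                  "outputLimit", "uploadLimit")

# Hypothetical fragment with two async declarations for outputLimit.
CAPS_DUP = """
<capability>
  <outputLimit forMode="async">
    <default unit="row">100000</default>
    <hard unit="row">1000000</hard>
  </outputLimit>
  <outputLimit forMode="async">
    <default unit="row">200000</default>
    <hard unit="row">5000000</hard>
  </outputLimit>
</capability>
"""


def duplicate_declarations(root):
    """Return (element name, forMode) pairs that occur more than once.

    A missing forMode attribute is represented as None, so at most one
    general declaration per limit type is checked as well.
    """
    counts = Counter(
        (elem.tag, elem.get("forMode"))
        for elem in root
        if elem.tag in LIMIT_ELEMENTS
    )
    return [key for key, n in counts.items() if n > 1]


root = ET.fromstring(CAPS_DUP)
print(duplicate_declarations(root))
# [('outputLimit', 'async')]
```

What a client does once it finds such a pair (first match, last match, or rejecting the document) is exactly the open question here; the sketch only reports the violation.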

@gmantele

gmantele commented Oct 2, 2025

For example say that the service declares:

<executionDuration>
  <default>30</default>
  <hard>120</hard>
</executionDuration>

<executionDuration forMode="sync">
  <default>60</default>
</executionDuration>

<executionDuration forMode="async">
  <default>3600</default>
  <hard>86400</hard>
</executionDuration>

What would be the default for sync? I'm assuming it should be 60, but is it clear from the document what clients should do? And is the hard limit for sync 120 or undefined? In other words, should clients scan all elements and merge them, or use first-match?

My understanding of this example for sync mode is: default=60 and hard=120.

I agree this corner case should be described explicitly. I'd say that the general block (i.e. the one with no forMode attribute) applies by default and is overridden by the most specific one (if any is present). Such an example could then be added to the document to make this more concrete for readers.
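
For illustration, the merge reading described in this comment could be sketched in Python as follows. This is only a sketch under assumptions: the function name `merged_limits` and the `<capability>` wrapper are hypothetical, XML namespaces are omitted, and nothing here is taken from the specification text.

```python
import xml.etree.ElementTree as ET

# The example from this thread, wrapped in a hypothetical <capability>
# element; namespaces are left out to keep the sketch short.
CAPS = """
<capability>
  <executionDuration>
    <default>30</default>
    <hard>120</hard>
  </executionDuration>
  <executionDuration forMode="sync">
    <default>60</default>
  </executionDuration>
  <executionDuration forMode="async">
    <default>3600</default>
    <hard>86400</hard>
  </executionDuration>
</capability>
"""


def merged_limits(root, name, mode):
    """Merge reading: start from the general element, then let the
    mode-specific element overwrite individual child values."""
    limits = {}
    # General declarations (no forMode) are applied first and the
    # mode-specific ones second, so the latter win child by child.
    for wanted in (None, mode):
        for elem in root.findall(name):
            if elem.get("forMode") == wanted:
                for child in elem:
                    limits[child.tag] = int(child.text)
    return limits


root = ET.fromstring(CAPS)
print(merged_limits(root, "executionDuration", "sync"))
# {'default': 60, 'hard': 120}
```

Under this reading the sync hard limit is inherited from the general block, which is the point on which the merge and whole-element interpretations differ.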

@gmantele

gmantele commented Oct 2, 2025

The second edge case is that the schema allows multiple declarations for the same mode. For example I think this would be valid:

<outputLimit forMode="async">
  <default unit="row">100000</default>
  <hard unit="row">1000000</hard>
</outputLimit>

<outputLimit forMode="async">
  <default unit="row">200000</default>
  <hard unit="row">5000000</hard>
</outputLimit>

Should we outline how clients should handle this, and should the document prohibit it somehow? Since XSD constraints seem difficult here (unless there is an easy way I'm not aware of), should we add explicit guidance in the specification?

Something like:

Uniqueness of Mode-Specific Declarations

For each limit type, servers MUST NOT include more than one element with the same forMode attribute value.
Servers MAY include:

  • At most one element without forMode (the general declaration)
  • At most one element with forMode="sync"
  • At most one element with forMode="async"

If the capabilities document violates this constraint, clients SHOULD/MUST(?) use the first matching element.

Here, I'd say there are two approaches:

  1. Allow multiple occurrences for the same mode and take only the last one into account (but what happens if one defines the hard limit and the other does not?)
  2. As you suggest, explicitly say that at most one occurrence for each mode may be included.

My preference is for the 2nd option, as @stvoutsin suggests, since the 1st one would clearly be more confusing for everyone. If it is adopted, may I suggest reflecting this in the XML schema by setting the maxOccurs attribute to 3 instead of unbounded?

@mbtaylor
Member

mbtaylor commented Oct 2, 2025

I'd say that the text in the proposed section 2.2 Mode-dependent Declarations covers this in sufficient detail:

Elements without a forMode attribute apply to all modes for which no more specific specification is given. Elements with a forMode attribute override elements without one. That is, to find the applicable declaration for a particular access mode, clients should first look for an element with the current access mode and then for one without forMode. Whatever is found first is the relevant declaration.
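
Read literally, the lookup rule quoted above can be sketched in Python roughly as follows. The names `applicable_declaration` and `limits_of` and the `<capability>` wrapper are hypothetical, XML namespaces are omitted, and treating the found element as a whole (without inheriting missing children) is one possible reading of the quoted text, not a settled interpretation.

```python
import xml.etree.ElementTree as ET

# The example from this thread, wrapped in a hypothetical <capability>
# element; namespaces are left out to keep the sketch short.
CAPS = """
<capability>
  <executionDuration>
    <default>30</default>
    <hard>120</hard>
  </executionDuration>
  <executionDuration forMode="sync">
    <default>60</default>
  </executionDuration>
  <executionDuration forMode="async">
    <default>3600</default>
    <hard>86400</hard>
  </executionDuration>
</capability>
"""


def applicable_declaration(root, name, mode):
    """Look for an element with the current mode first, then for one
    without forMode; whatever is found first is returned."""
    for wanted in (mode, None):
        for elem in root.findall(name):
            if elem.get("forMode") == wanted:
                return elem
    return None


def limits_of(elem):
    """Child values of one declaration; nothing is inherited."""
    if elem is None:
        return {}
    return {child.tag: int(child.text) for child in elem}


root = ET.fromstring(CAPS)
print(limits_of(applicable_declaration(root, "executionDuration", "sync")))
# {'default': 60}
```

With this reading, the sync declaration carries no hard limit at all, because the general element is never consulted once a mode-specific one has been found.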

Admittedly it doesn't explicitly forbid adding multiple contradictory limits; that could be added but it seems pretty obvious to me how that would work (services shouldn't do it, I don't know why they would, and if clients find such things they'd be within their rights to use whichever one they like).

@gmantele

gmantele commented Oct 2, 2025

My bad, I did not read the diff of the .tex part... sorry. Indeed, the 1st corner case of @stvoutsin seems to be covered there.

But, as you said, not the 2nd one. I'd say it should be, because of the confusion about behavior that it would cause on both the server and client sides.

@mbtaylor
Member

mbtaylor commented Oct 2, 2025

I'm happy with the revised text.

@gmantele

gmantele commented Oct 2, 2025

It looks good to me too.
I can review and approve this PR whenever @stvoutsin and you are also happy with this version (unless you want to add anything).

@msdemlei
Collaborator Author

msdemlei commented Oct 2, 2025 via email

@mbtaylor
Member

mbtaylor commented Oct 2, 2025

I'm trying again in commit #7165975 and immediately dislike the language. That this is so complicated is instilling doubts in me whether that's actually what we should do. Hm. Better text solicited, either way.

I think you're being too hard on your text. It looks comprehensible to me.

@stvoutsin

This version looks good to me, but I still wonder if there might be ambiguity in the edge case that I mentioned.

The text specifies that

limits given in elements with forMode override limits given in elements without one for the mode specified.

However, "override" could be interpreted either as a complete replacement or as a merge/patch.
My interpretation would be a complete replacement, because we say that "limits given in elements" override rather than "child values in elements", which suggests to me that the entire element is the unit of replacement.
However, @gmantele's interpretation reads it as a merge. So perhaps it would be useful to clarify?

Also, one thought: should we change the text structure a bit to reflect the two distinct behaviours?
I'm thinking something like this:

Elements without a forMode attribute apply to all modes.
The forMode attribute controls two different behaviors:

Additive declarations: outputFormat, uploadMethod
Mode-specific declarations add to the general set. All declarations
apply, mode-specific ones don't remove or replace general ones.

Override declarations (limit elements): retentionPeriod, executionDuration, outputLimit, uploadLimit
Mode-specific declarations completely replace general ones for that mode.
When a mode-specific limit element is found, only its child elements apply; missing child elements are not inherited from the general declaration.
At most one occurrence per mode is allowed (including at most one without forMode).
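
To make the contrast between the two proposed behaviours concrete, here is a small Python sketch. It is an illustration under assumptions: the function names `additive` and `override`, the `<capability>` wrapper, and all values in the fragment are hypothetical, and XML namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the formats and limits are made-up values.
CAPS = """
<capability>
  <outputFormat><mime>application/x-votable+xml</mime></outputFormat>
  <outputFormat forMode="async"><mime>application/fits</mime></outputFormat>
  <outputLimit>
    <default unit="row">1000</default>
    <hard unit="row">100000</hard>
  </outputLimit>
  <outputLimit forMode="async">
    <default unit="row">200000</default>
  </outputLimit>
</capability>
"""


def additive(root, name, mode):
    """Additive: general declarations plus those given for `mode`."""
    return [e for e in root.findall(name)
            if e.get("forMode") in (None, mode)]


def override(root, name, mode):
    """Override: the mode-specific element if present, else the general
    one; missing children are not inherited."""
    for wanted in (mode, None):
        for e in root.findall(name):
            if e.get("forMode") == wanted:
                return e
    return None


root = ET.fromstring(CAPS)
print([e.findtext("mime") for e in additive(root, "outputFormat", "async")])
# ['application/x-votable+xml', 'application/fits']
print({c.tag: int(c.text) for c in override(root, "outputLimit", "async")})
# {'default': 200000}
```

In this sketch the async output formats are the union of both declarations, whereas the async output limit has no hard value because the general element is replaced wholesale.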

Having said that, if you disagree that the edge case is ambiguous or don't like this structure, what you have is also perfectly fine from my point of view, so I'm happy for you to merge either way.

@msdemlei
Collaborator Author

msdemlei commented Oct 6, 2025 via email
