Adding a facility to express per-mode limits #8

msdemlei · 2025-09-15T07:56:33Z

This PR suggests limit groups with which operators can override the default limit specifications (e.g., async has a lot more time). This came out of astropy/pyvo#685. Actually, it would be simple to add an expression of a preferred mode for a given service on top of this. The discussion in the bug seems to indicate that there is little use for that, but I am not too convinced. With a bit of encouragement, I'd try this.

gmantele · 2025-09-15T08:28:39Z

TAPRegExt-v1.1.xsd

-              Limits on the time between job creation and
-              destruction time.
+              Limits for how long a service will retain async jobs
+              for the service's default access


I am not a big fan of the idea of "default access mode", but in order to stay backward compatible, there is not other choice.

However, you speak about "retaining async jobs", but the example of "typical default access mode" is "anonymous-sync". It does not quite work here, isn't it?

Then, where do you specify what is the default access mode?

Maybe the best thing to do is not to give example of default mode...

msdemlei · 2025-09-16T15:39:22Z

On Mon, Sep 15, 2025 at 01:34:31AM -0700, Grégory Mantelet wrote: @gmantele commented on this pull request. > @@ -82,8 +82,9 @@ xsi:schemaLocation="http://www.w3.org/2001/XMLSchema http://vo.ari.uni-heidelber minOccurs="0" maxOccurs="1"> <xs:annotation> <xs:documentation> - Limits on the time between job creation and - destruction time. + Limits for how long a service will retain async jobs + for the service's default access I am not a big fan of the idea of "default access mode", but in order to stay backward compatible, there is not other choice.

Well... we could define that these are always for anonymous access at least, but that still won't let us off the hook on this:

However, you speak about "retaining async jobs", but the example of "typical default access mode" is "anonymous-sync". It does not quite work here, isn't it?

Good point. Does anyone have a good idea for language to convey "globally define `typical' limits"?

Then, where do you specify what is the default access mode?

We don't, at this point. That's one of the open questions I mention in the PR description. Perhaps we can say "anon sync if applicable, authenticated sync for TAP services requiring authentication"?

> @@ -82,8 +82,9 @@ xsi:schemaLocation="http://www.w3.org/2001/XMLSchema http://vo.ari.uni-heidelber minOccurs="0" maxOccurs="1"> <xs:annotation> <xs:documentation> - Limits on the time between job creation and - destruction time. + Limits for how long a service will retain async jobs + for the service's default access Maybe the best thing to do is not to give example of default mode...

Perhaps "will retain anonymous async jobs"? But what do required-auth services do then? All this makes me suspect that we are missing a much better idea. Here's what my gospel has to say about this: $ python -c "import this" | grep explain If the implementation is hard to explain, it's a bad idea. If the implementation is easy to explain, it may be a good idea. Of course, sometimes a bad idea is better than nothing at all...

mbtaylor · 2025-09-17T09:26:45Z

There's rather a lot going on in this PR that is unrelated to the limits business, but I think I found the relevant parts.

Repeating the limit elements in a PerModeLimits looks a bit clunky, though I may not have any better ideas. One other possibility would be adding an optional mode attribute to the four limit elements allowing them to indicate what mode they apply to, and allowing multiple instances of each of those elements, with no PerModeLimits element. Older clients might get confused by seeing multiple limit elements, but would probably just pick the value for one mode, and since it's not specified at present what mode that applies to they won't be grossly misinformed. However, I'm not necessarily saying that would be better than the proposed structure.

As far as wording goes, rather than talking about a default access mode and suggesting a "typical" assignment for what that is, I'd be inclined just to call the limit "default", e.g. "Default limits on the size of uploaded data" rather than "Limits on the size of uploaded data for the service's default access mode (typically anonymous-sync)". Services that don't want to distinguish can then just put the one limit in as now, but those which have per-mode limits can add mode-specific instances to cover the non-default case(s). From a client's point of view, it would just pick up the default limit, then override that for the current access mode if a suitable mode-specific one is present.

Limits for "auth-sync" and "auth-async" are likely to be too blunt an instrument for some services, since the limits in force may depend on your authenticated identity. The existing TAPRegExt 1.0 text (modified slightly by some earlier post-1.0 commit, and then deleted by this PR) already tackles this:

If a service supports authentication and has different limits depending on what user is authenticated, it should return the limits applying to the logged user.

I would therefore suggest reinstating the above text and restricting the modes to "sync" and "async". I admit that doesn't cover registry content, but it does take care of giving clients reading the capabilities endpoint an accurate idea of limits.

msdemlei · 2025-09-24T09:57:14Z

On Wed, Sep 17, 2025 at 02:27:07AM -0700, Mark Taylor wrote: There's rather a lot going on in this PR that is unrelated to the limits business, but I think I found the relevant parts.

Uh, sorry about that; there was some housekeeping to do as I did the per-mode thing and I got lazy.

though I may not have any better ideas. One other possibility would be adding an optional mode attribute to the four limit elements allowing them to indicate what mode they apply to, and allowing multiple instances of each of those elements, with no PerModeLimits element. Older clients *might* get confused by

This sounds like a reasonable thing in particular *if* we only go for sync and async as modes. It looks less reasonable if we keep four modes and half expect that more may come.

As far as wording goes, rather than talking about a default access mode and suggesting a "typical" assignment for what that is, I'd be inclined just to call the limit "default", e.g. "Default limits on

Ok; that's what is in in commit 672701b.

Limits for "auth-sync" and "auth-async" are likely to be too blunt an instrument for some services, since the limits in force may depend on your authenticated identity. The existing TAPRegExt 1.0 text (modified slightly by some earlier post-1.0 commit, and then deleted by this PR) already tackles this: > If a service supports authentication and has different limits > depending on what user is authenticated, it should return the > limits applying to the logged user. I would therefore suggest reinstating the above text and restricting the modes to "sync" and "async". I admit that doesn't cover registry content, but it does take care of giving clients reading the capabilities endpoint an accurate idea of limits.

Hm. I see your point. On the other hand, I kind of like the idea that people can figure out what they'd win if the logged in, and for that you would need the four modes. I think the way to decide where to go is: Do people actually enforce different limits according to who is logged in? If that is true, we would by lying in auth-sync and auth-async at least for these services. Then there is no (or negative) benefit in having them, and we should drop the auth-X modes. Soooo: Are you aware of services that do per-user limits or plan to do that? Rubin folks: Do you think about that?

stvoutsin · 2025-09-24T18:38:22Z

TAPRegExt-v1.1.xsd

    </xs:simpleContent>
  </xs:complexType>

+	<xs:complexType name="PerModeLimits">


There's one per-mode use case that I'm thinking about that doesn't logically fit within PerModeLimits: output formats with mode-specific technical constraints. For example, we're considering VOParquet support exclusively for async queries, since partitioning large result sets across multiple files isn't feasible within sync queries.

What would be the optimal design if we wan to consider this? Should we rename PerModeLimits to PerModeCapabilities to encompass both limits and format capabilities, or would per-mode formats warrant a separate grouping?

msdemlei · 2025-09-25T11:56:10Z

On Wed, Sep 24, 2025 at 11:38:44AM -0700, stvoutsin wrote: What would be the optimal design if we wan to consider this? Should we rename `PerModeLimits` to `PerModeCapabilities` to encompass both limits and format capabilities, or would per-mode formats warrant a separate grouping?

Hm. I think that's another argument for Mark's design with a mode attribute on selected (repeatable) elements. <outputFormat mode="async"...>? Hm. Or, to be future-proof, make this an element so you can give multiple modes in case we ever grow more of them? <outputFormat> <mode>async</mode> <mode>async2</mode>? Hm. What would be the list of elements for which we need mode, then? Is this supportable in clients? Mood report: I'm increasingly leaning towards suppering PerModeLimits and have the mode attribute or element. Other mood reports, anyone?

pdowler · 2025-09-25T16:38:01Z

I feel like we don't really know which features we describe might be mode-specific so I think I also favour a mode attribute.

I think I would lean toward the simplest: an optional attribute on some (all?) child elements. That way, no attribute present means the same thing it does now and elements would not need to be repeated for both modes. For some mode-specific option, eg:

    <outputFormat mode="async">
      <mime>application/vnd.apache.parquet</mime>
      <alias>parquet</alias>
    </outputFormat>

a client that did not grok the mode attribute would interpret that as applying to both modes. If that's a problem we need to solve, then this approach is probably not sufficient....

I would like to see a realistic example of where a repeatable mode element would really be better than a mode attribute... it looks to me like attribute is sufficient (assuming we are ok with allowing multiple of al child elements).

pdowler · 2025-09-25T16:41:25Z

Would we (on principle) add the mode attribute to <language>? I do not have a use case in mind... just thinking about future-proofing.

What about it's children: <languageFeatures>? individual <feature>s?? I feel like this would be going to far...

pdowler · 2025-09-25T16:52:44Z

In general, I feel like auth vs anon should be orthogonal to all features, but we do have one scenario where that is not the case: we support a custom DEST parameter so the user can specify that the output be sent to some external location (either an http URL that accepts PUT or a vos URI where we send the output to the specified location in a VOSpace service).

I do not know if this is a viable optional feature (param) that could be added to TAP or not, but if it did it would be an async specific feature (because sync inherently returns the output, while async stores it) and for practical reasons it requires auth to be able to perform the PUT (as the user).

To be clear: I do not think auth should be treated as a mode and responding with a "permission denied" (and auth challenges) if the job cannot write the output to DEST is probably fine, but a client can't really figure out how to make such a feature work without trial and error. Over in DataLink we added some predictive options so clients could tell what they might need to use an access_url, which seems related to this...

I am happy to consider this out-of-scope for now, just FYI.

I am removing the in-document example; it seems a waste of paper, and with auxiliaryurl we now have a better alternative.

Also, adding tests and updating the CI workflow.

gmantele · 2025-10-02T09:40:47Z

Would we (on principle) add the mode attribute to <language>? I do not have a use case in mind... just thinking about future-proofing.

What about it's children: <languageFeatures>? individual <feature>s?? I feel like this would be going to far...

I am not sure about setting this forMode attribute on language. Fundamentally, there should be no difference between sync and async (except maybe if we favor a very consuming ADQL feature for the async mode...but I clearly don't want to encourage this).

I would however see a possible interest between the auth and anon mode ; you could have some features only reserved for an authenticated community (e.g. a special catalogue accessible in auth mode only and which would require some special UDFs). But I think that this is out of topic.

gmantele reviewed Sep 15, 2025

View reviewed changes

msdemlei force-pushed the per-mode-limits branch from e8cbc38 to d832e48 Compare September 22, 2025 12:41

stvoutsin reviewed Sep 24, 2025

View reviewed changes

This was referenced Sep 29, 2025

Housekeeping #9

Merged

Add a forMode attribute to several elements #10

Open

msdemlei added 3 commits September 30, 2025 14:16

Updating example for per-mode limits.

52c9a03

I am removing the in-document example; it seems a waste of paper, and with auxiliaryurl we now have a better alternative.

Adding a limits element.

4873d53

Also, adding tests and updating the CI workflow.

Purging the notion of a 'default access mode'

845d03e

msdemlei force-pushed the per-mode-limits branch from d832e48 to 845d03e Compare September 30, 2025 12:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Adding a facility to express per-mode limits #8

Adding a facility to express per-mode limits #8

Uh oh!

msdemlei commented Sep 15, 2025

Uh oh!

gmantele Sep 15, 2025

Uh oh!

gmantele Sep 15, 2025

Uh oh!

msdemlei commented Sep 16, 2025 via email

Uh oh!

mbtaylor commented Sep 17, 2025

Uh oh!

msdemlei commented Sep 24, 2025 via email

Uh oh!

stvoutsin Sep 24, 2025

Uh oh!

msdemlei commented Sep 25, 2025 via email

Uh oh!

pdowler commented Sep 25, 2025

Uh oh!

pdowler commented Sep 25, 2025

Uh oh!

pdowler commented Sep 25, 2025

Uh oh!

gmantele commented Oct 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Adding a facility to express per-mode limits #8

Are you sure you want to change the base?

Adding a facility to express per-mode limits #8

Uh oh!

Conversation

msdemlei commented Sep 15, 2025

Uh oh!

gmantele Sep 15, 2025

Choose a reason for hiding this comment

Uh oh!

gmantele Sep 15, 2025

Choose a reason for hiding this comment

Uh oh!

msdemlei commented Sep 16, 2025 via email

Uh oh!

mbtaylor commented Sep 17, 2025

Uh oh!

msdemlei commented Sep 24, 2025 via email

Uh oh!

stvoutsin Sep 24, 2025

Choose a reason for hiding this comment

Uh oh!

msdemlei commented Sep 25, 2025 via email

Uh oh!

pdowler commented Sep 25, 2025

Uh oh!

pdowler commented Sep 25, 2025

Uh oh!

pdowler commented Sep 25, 2025

Uh oh!

gmantele commented Oct 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants