The current WIP Fluence sidecar design handles the common case of a hybrid quantum-classical workflow: one gateway pod submits a quantum task, and N gated classical pods all process the same result. In practice, I think this is the most common use case: we launch some scaled quantum job, and some number of gated follow-up pods receive the ARN and interact with the outputs as they please. In the current model, the sidecar discovers the task ARN, waits for queue position == 1, and ungates all N classical pods simultaneously with the same ARN patched onto each.
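As a rough illustration of the current broadcast behavior, here is a minimal Python sketch. It is not the sidecar implementation: `get_position`, `wait_for_front_of_queue`, and `broadcast_plan` are hypothetical names, the queue lookup is injected as a plain callable instead of a real Braket call, and the `<= 1` check is a defensive assumption (the design above says position == 1).

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_front_of_queue(
    get_position: Callable[[], int],
    poll_seconds: float = 0.0,
    max_polls: int = 100,
) -> bool:
    """Poll a (hypothetical) queue-position lookup until the task is at the front.

    Returns True once the position is 1 (or lower), False if max_polls is exhausted.
    """
    for _ in range(max_polls):
        # Assumption: treat anything at or below 1 as "front of queue".
        if get_position() <= 1:
            return True
        time.sleep(poll_seconds)
    return False


def broadcast_plan(task_arn: str, classical_pods: Iterable[str]) -> Dict[str, str]:
    """Current model: every gated classical pod receives the same task ARN."""
    return {pod: task_arn for pod in classical_pods}


# Simulated queue positions for one task; the ARN is a placeholder string.
positions = iter([3, 2, 1])
ready = wait_for_front_of_queue(lambda: next(positions))
plan = broadcast_plan("arn-placeholder", ["classical-0", "classical-1", "classical-2"])
```

The point to notice is that `broadcast_plan` has exactly one ARN to hand out, which is the assumption the scatter case breaks.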
The scatter problem
Thinking through this, a more advanced class of hybrid workflows requires a 1:1 pairing between quantum tasks and classical pods. More specifically, we want N probe pods, each of which submits its own quantum task. The example I stumbled on was variational gradient estimation using the parameter-shift rule:
For a variational circuit with P parameters, computing the gradient requires 2P circuit evaluations: two circuits per parameter, one with that parameter shifted up and one with it shifted down. These circuits can be submitted in parallel and processed independently. This means that we have one task per pod, and one ARN per follow-up classical pod that is ungated. Our current design does not handle this because it assumes a single initial probe pod (N=1) that launches all the quantum work, to be processed afterward by the ungated set. In this case, each gateway pod should ungate only its paired classical pod, with that pod's specific task ARN.
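To make the 2P count concrete, here is a small Python sketch of the parameter-shift rule in its common form, with a shift of π/2 and a factor of 1/2. The quantum evaluation is replaced by a toy function (a product of cosines) so the example runs anywhere; `toy_expectation` and the other names are illustrative and not part of Fluence.

```python
import math
from typing import List, Sequence

SHIFT = math.pi / 2  # standard shift for the common form of the rule


def shifted_parameter_sets(theta: Sequence[float]) -> List[List[float]]:
    """Return 2P parameter vectors, ordered (+shift, -shift) for each parameter."""
    out = []
    for i in range(len(theta)):
        for sign in (+1, -1):
            shifted = list(theta)
            shifted[i] += sign * SHIFT
            out.append(shifted)
    return out


def gradient_from_results(results: Sequence[float]) -> List[float]:
    """Combine 2P results (plus, minus per parameter) into P gradient entries."""
    return [(results[2 * i] - results[2 * i + 1]) / 2 for i in range(len(results) // 2)]


def toy_expectation(theta: Sequence[float]) -> float:
    """Stand-in for one quantum task: a product of cosines."""
    return math.prod(math.cos(t) for t in theta)


theta = [0.3, 1.1, 2.0]
circuits = shifted_parameter_sets(theta)           # 2P independent evaluations
results = [toy_expectation(c) for c in circuits]   # each could be one probe pod
grad = gradient_from_results(results)
```

Each entry of `circuits` is independent of the others, which is why one probe pod per entry, each with its own task ARN, is a natural fit.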
Proposed design
Add a fluence.io/scatter: "true" annotation that activates scatter mode for a pod group (the user would apply this). Each probe pod, which needs only trivial resources, is then paired with its classical counterpart. The Fluence webhook would add the proper annotations, and the sidecar would (still) handle dispatch. I'll work on this after the Braket sidecar and gateway pod are implemented. Specifically:
- A new fluence.io/scatter annotation recognized by the webhook
- Index-based pod matching logic in the webhook
- Documentation and an e2e test for the scatter pattern
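One possible shape for the index-based matching is sketched below in Python. This is an assumption about how pairing could work, not the webhook's design: it supposes that pod names end in an ordinal suffix (such as `probe-0` and `classical-0`), and every function name and ARN string here is hypothetical.

```python
import re
from typing import Dict, List


def pod_index(name: str) -> int:
    """Extract the trailing ordinal from a pod name such as 'probe-3'."""
    match = re.search(r"-(\d+)$", name)
    if match is None:
        raise ValueError(f"pod name has no trailing index: {name!r}")
    return int(match.group(1))


def pair_by_index(probes: List[str], classical: List[str]) -> Dict[str, str]:
    """Map each probe pod to the classical pod that shares its index."""
    probes_by_index = {pod_index(p): p for p in probes}
    classical_by_index = {pod_index(c): c for c in classical}
    if len(probes_by_index) != len(probes) or len(classical_by_index) != len(classical):
        raise ValueError("duplicate pod index within a group")
    if probes_by_index.keys() != classical_by_index.keys():
        raise ValueError("scatter mode needs exactly one classical pod per probe pod")
    return {probes_by_index[i]: classical_by_index[i] for i in sorted(probes_by_index)}


def plan_ungate(pairs: Dict[str, str], task_arns: Dict[str, str]) -> Dict[str, str]:
    """For each probe whose task ARN is known, return {classical pod: that ARN}."""
    return {pairs[probe]: arn for probe, arn in task_arns.items()}


pairs = pair_by_index(["probe-1", "probe-0"], ["classical-0", "classical-1"])
# Only probe-1 has reported a task so far, so only classical-1 is ungated.
ungate_now = plan_ungate(pairs, {"probe-1": "arn-placeholder-1"})
```

Unlike the broadcast case, each classical pod is ungated independently as its own probe's task becomes ready, so slow tasks do not hold back the rest of the group.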
Note that this should also handle "embarrassingly parallel" cases, where we just need to run N copies of the same thing. When a set of pods is orchestrated together, it is up to the user application to handle the eventual scale-out of the group.