Conversation
```rust
let this = Arc::clone(self);
let tracing_span = tracing::Span::current();
let handle = tokio::spawn(
```
This needs a good comment explaining that we spawn the task so that we can still cancel the submission after the autopilot has already terminated the request (i.e. when the autopilot concluded we didn't submit within the deadline before we did).
Otherwise this code just looks super strange. 😅
```rust
self.settle_queue.try_send(request).map_err(|err| {
    tracing::warn!(?err, "Failed to enqueue /settle request");
let admission_permit = self.submitter_pool.try_admit().ok_or_else(|| {
    tracing::warn!("no idle submission slots; settle request rejected");
```
This permit only grants entry into the queue; it does not yet start the actual submission process.
Suggested change:
```diff
- tracing::warn!("no idle submission slots; settle request rejected");
+ tracing::warn!("too many pending settlements; settle request rejected");
```
```rust
    .await,
);

let solution_ids = join_all(vec![
```
Probably good to have a comment explaining that we can submit as many bids as we want as long as there is at least 1 settle queue spot available.
```rust
let mut admitted = 0;
let mut rejected = 0;
for result in &results {
    match result.error_kind().as_deref() {
        None | Some("FailedToSubmit") => admitted += 1,
        Some("TooManyPendingSettlements") => rejected += 1,
        Some(other) => panic!("unexpected error kind: {other}"),
    }
}
```
If we sleep briefly (10ms) between the `/settle` calls, does the outcome of the settle futures become deterministic? That way we could have a stricter test asserting the exact sequence of results:
`None` (success), `"FailedToSubmit"` (order already filled), `"FailedToSubmit"` (order already filled), `"TooManyPendingSettlements"`, `"TooManyPendingSettlements"`
```rust
/// requests to `pool_slots + settle_queue_size` (default 1 + 2 = 3).
#[tokio::test]
#[ignore]
async fn admission_capacity_is_respected() {
```
Am I wrong, or is this test a better version of `discards_excess_settle_and_solve_requests`? Can we delete the other one?
```rust
handle.await.map_err(|err| {
    tracing::error!(?err, "settle task panicked");
    Error::SubmissionError
})?
```
You probably already worked on the new code before I edited one of my previous comments, but I think getting rid of the channels comes with an edge case.
If the driver's block stream is stuck (i.e. it doesn't see new incoming blocks), the driver will never stop submitting the current txs and the settle queue will fill up.
Since we can consider the autopilot the source of truth, we should probably keep the logic that continues polling the settle future for a second. Either the block stream is healthy and the driver will quickly see that it should try to cancel the tx, or it's not healthy and the driver will still not cancel the submission, but at least it will free up the submission slot again so that it can theoretically submit new solutions going forward. (see #3427)
However, we can probably still keep the "grace period" idea but a bit simpler than what we had with the oneshot channels. Something like this should do the trick:
```rust
struct SettleTaskHandle<T: Send + 'static>(tokio::task::JoinHandle<T>);

impl<T: Send + 'static> Drop for SettleTaskHandle<T> {
    fn drop(&mut self) {
        if self.0.is_finished() {
            return;
        }
        // continue polling the settle future for a short grace period
        // see <https://github.com/cowprotocol/services/pull/3427>
        let abort_handle = self.0.abort_handle();
        tokio::task::spawn(async move {
            tokio::time::sleep(std::time::Duration::from_secs(1)).await;
            abort_handle.abort();
        });
    }
}
```
Description
The reference driver rejects new solutions when there is already a backlog of solutions that still need to be submitted because they will most likely not be mined in time. This is intended to protect very competitive solvers from penalties when they win too much but can't submit fast enough.
#4167 introduced a bug where the check for whether to reject a `/solve` request only looks at the available tx submission slots, but not at the settle queue. As a consequence, a solver with only a single submission EOA that won an auction will reject `/solve` requests until the previous solution has been submitted.
Changes
Add a semaphore with capacity equal to the queue size to mimic the missing queue behavior.