Update transaction pipelining docs re: parallelism

rmloveland · rmloveland · commit ce0234e76ad5 · 2025-09-09T16:04:13.000-04:00
Fixes DOC-8591. Specifically, the docs state that the cost of writes is ~`O(1)` in the number of inserts, which is misleading. Despite the parallelism introduced by pipelining, each write still incurs other work that does not come "for free".
diff --git a/src/current/v25.4/architecture/life-of-a-distributed-transaction.md b/src/current/v25.4/architecture/life-of-a-distributed-transaction.md
@@ -116,7 +116,7 @@ The batch evaluator ensures that write operations are valid. Our architecture ma
 
 If the write operation is valid according to the evaluator, the leaseholder sends a provisional acknowledgment to the gateway node's `DistSender`; this lets the `DistSender` begin to send its subsequent `BatchRequests` for this range.
 
-Importantly, this feature is entirely built for transactional optimization (known as [transaction pipelining]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-pipelining)). There are no issues if an operation passes the evaluator but doesn't end up committing.
+Importantly, this feature is entirely built for transactional optimization (known as [transaction pipelining]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-pipelining)). For caveats about how pipelining does and does not affect end-to-end latency, see that section. There are no issues if an operation passes the evaluator but doesn't end up committing.
 
 ### Reads from the storage layer
 
diff --git a/src/current/v25.4/architecture/transaction-layer.md b/src/current/v25.4/architecture/transaction-layer.md
@@ -352,6 +352,10 @@ COMMIT;
 
 With transaction pipelining, write intents are replicated from leaseholders in parallel, so the waiting all happens at the end, at transaction commit time.
 
+{{site.data.alerts.callout_info}}
+Transaction pipelining overlaps the Raft consensus work for intent writes across statements. It does not make individual SQL statements free. Each statement must still be planned and evaluated (e.g., index lookups, constraint checks, conflict detection, and waiting on contending writes), and the client submits statements sequentially. Statements that touch the same rows can also stall the pipeline to preserve read-your-writes ordering. As a result, while the consensus component of write latency can approach `O(1)` with respect to the number of statements, end-to-end transaction latency can still scale with the number of statements due to SQL evaluation and dependency stalls.
+{{site.data.alerts.end}}
+
 At a high level, transaction pipelining works as follows:
 
 1. For each statement, the transaction gateway node communicates with the leaseholders (*L*<sub>1</sub>, *L*<sub>2</sub>, *L*<sub>3</sub>, ..., *L*<sub>i</sub>) for the ranges it wants to write to. Since the primary keys in the table above are UUIDs, the ranges are probably split across multiple leaseholders (this is a good thing, as it decreases [transaction conflicts](#transaction-conflicts)).
@@ -362,7 +366,7 @@ At a high level, transaction pipelining works as follows:
 
 1. When attempting to commit, the transaction gateway node then waits for the write intents to be replicated in parallel to all of the leaseholders' followers. When it receives responses from the leaseholders that the write intents have propagated, it commits the transaction.
 
-In terms of the SQL snippet shown above, all of the waiting for write intents to propagate and be committed happens once, at the very end of the transaction, rather than for each individual write. This means that the cost of multiple writes is not `O(n)` in the number of SQL DML statements; instead, it's `O(1)`.
+In terms of the SQL snippet shown above, all of the waiting for write intents to propagate and be committed happens once, at the very end of the transaction, rather than for each individual write. This means the consensus-related waiting is not `O(n)` in the number of SQL DML statements; instead, it approaches `O(1)`. The overall client-observed latency still includes per-statement planning, evaluation, and any pipeline stalls, so it is generally not `O(1)`.
 
 ### Parallel Commits