Merged
198 changes: 198 additions & 0 deletions TEMPLATES.md
@@ -5710,6 +5710,204 @@ _No details available yet._
</details>
</details>

## Sharded Apply

### PR Comments

<details>
<summary><a name="plan-divergent-shards"></a><strong>Plan: Divergent Shards</strong></summary>


## Schema Change Plan — Production

**Database**: `cdb_resolute` | **Type**: `Strata`

*Requested by @jackjackbits at 2026-01-01 00:00:00 UTC · planned from [`abcdef1`](https://github.com/block/schemabot/commit/abcdef1234567890abcdef1234567890abcdef12)*

#### Keyspace: `cdb_resolute_sharded`
Shards diverge — what applies where:

**shards `-40`, `80-c0`, `c0-`**

```sql
ALTER TABLE `mutes` ADD INDEX `created_at`(`created_at`);
```

**shard `40-80`**

```sql
ALTER TABLE `mutes`
ADD INDEX `created_at`(`created_at`),
ADD COLUMN `reason` varchar(255);
```

📋 **Plan**: **1** table to alter


---

💡 **To apply** all schema changes from this PR, comment:
```
schemabot apply -e production
```

</details>

<details>
<summary><a name="plan-unsafe-change-on-one-shard"></a><strong>Plan: Unsafe Change On One Shard</strong></summary>


## Schema Change Plan — Production

**Database**: `cdb_resolute` | **Type**: `Strata`

*Requested by @jackjackbits at 2026-01-01 00:00:00 UTC · planned from [`abcdef1`](https://github.com/block/schemabot/commit/abcdef1234567890abcdef1234567890abcdef12)*

#### Keyspace: `cdb_resolute_sharded`
Shards diverge — what applies where:

**shards `-40`, `80-c0`, `c0-`**

```sql
ALTER TABLE `mutes` ADD INDEX `created_at`(`created_at`);
```

**shard `40-80`**

```sql
ALTER TABLE `mutes`
ADD INDEX `created_at`(`created_at`),
DROP COLUMN `legacy_reason`;
```

**⛔ Unsafe Changes Detected:**
- `mutes` (shard `40-80`): DROP COLUMN removes data and is irreversible

**Destructive drop guidance:**

Before allowing a destructive drop, first deploy application code that no longer reads from or writes to the dropped column.

📋 **Plan**: 2 DDL statements


---

💡 **To apply** all schema changes from this PR, comment:
```
schemabot apply -e production
```

</details>

<details>
<summary><a name="apply-in-progress"></a><strong>Apply In Progress</strong></summary>


## Schema Change In Progress — Production

**Database**: `cdb_resolute` | **Type**: `Strata` | **Apply ID**: `apply-a1b2c3d4e5f6`

*Applied by @jackjackbits at 2026-01-01 00:00:00 UTC*

**Shards**: 1 running table copy, 3 waiting for -40

#### Keyspace `cdb_resolute_sharded`

| Shard | Status |
| --- | --- |
| `-40` | 🔄 running table copy |
| `40-80` | ⏳ waiting for -40 |
| `80-c0` | ⏳ waiting for -40 |
| `c0-` | ⏳ waiting for -40 |

`mutes`
```sql
ALTER TABLE `mutes` ADD INDEX `created_at`(`created_at`);
```

_Last updated: <relative-time datetime="2026-01-01T00:00:00Z">2026-01-01 00:00:00 UTC</relative-time> (2026-01-01 00:00:00 UTC)_

</details>

<details>
<summary><a name="apply-failed-one-shard-failed"></a><strong>Apply Failed (One Shard Failed)</strong></summary>


## ❌ Schema Change Failed — Production

**Database**: `cdb_resolute` | **Type**: `Strata` | **Apply ID**: `apply-a1b2c3d4e5f6`

*Applied by @jackjackbits at 2026-01-01 00:00:00 UTC*

**Shards**: 1 failed, 3 halted

> ⚠️ **First failure:** shard <code>-40</code> — resolve shard primary for `-40`: context deadline exceeded

#### Keyspace `cdb_resolute_sharded`

| Shard | Status |
| --- | --- |
| `-40` | ❌ failed — resolve shard primary for `-40`: context deadline exceeded |
| `40-80` | ⏸ halted — -40 failed |
| `80-c0` | ⏸ halted — -40 failed |
| `c0-` | ⏸ halted — -40 failed |

`mutes`
```sql
ALTER TABLE `mutes` ADD INDEX `created_at`(`created_at`);
```

---

To retry:
```
schemabot apply -e production
```

</details>

<details>
<summary><a name="apply-with-divergent-shards"></a><strong>Apply With Divergent Shards</strong></summary>


## Schema Change In Progress — Production

**Database**: `cdb_resolute` | **Type**: `Strata` | **Apply ID**: `apply-a1b2c3d4e5f6`

*Applied by @jackjackbits at 2026-01-01 00:00:00 UTC*

**Shards**: 1 running table copy, 2 waiting for -40

#### Keyspace `cdb_resolute_sharded`

Shards diverge — grouped by change:

**shards `-40`, `80-c0`**

| Shard | Status |
| --- | --- |
| `-40` | 🔄 running table copy |
| `80-c0` | ⏳ waiting for -40 |

`mutes`
```sql
ALTER TABLE `mutes` ADD INDEX `created_at`(`created_at`);
```

**shard `40-80`**

| Shard | Status |
| --- | --- |
| `40-80` | ⏳ waiting for -40 |

`mutes`
```sql
ALTER TABLE `mutes` ADD INDEX `created_at`(`created_at`), ADD COLUMN `reason` varchar(255);
```

_Last updated: <relative-time datetime="2026-01-01T00:00:00Z">2026-01-01 00:00:00 UTC</relative-time> (2026-01-01 00:00:00 UTC)_
</details>

## Multi-Deployment Apply (CLI)

<details>
29 changes: 29 additions & 0 deletions pkg/api/proto_helpers.go
@@ -207,6 +207,35 @@ func planResponseFromProto(resp *ternv1.PlanResponse) *apitypes.PlanResponse {
		})
	}

	// Carry per-shard changes so the plan comment can show what applies to which
	// shard. The namespace-level Changes above collapse a divergent keyspace to a
	// single entry; the per-shard detail is preserved only here.
	for _, sp := range resp.Shards {
		if sp == nil {
			continue
		}
		apiSP := &apitypes.ShardPlanResponse{Namespace: sp.Namespace, Shard: sp.Shard}
		for _, t := range sp.Changes {
			if t == nil {
				continue
			}
			apiSP.Changes = append(apiSP.Changes, &apitypes.TableChangeResponse{
				TableName:    t.TableName,
				Namespace:    t.Namespace,
				DDL:          t.Ddl,
				ChangeType:   protoChangeTypeToOperation(t.ChangeType),
				IsUnsafe:     t.IsUnsafe,
				UnsafeReason: t.UnsafeReason,
			})
		}
		// A shard is changing iff it carries changes (the proto contract); drop an
		// empty shard plan so it never renders a blank shard section downstream.
		if len(apiSP.Changes) == 0 {
			continue
		}
		httpResp.Shards = append(httpResp.Shards, apiSP)
	}

	return httpResp
}

13 changes: 13 additions & 0 deletions pkg/apitypes/apitypes.go
@@ -205,6 +205,19 @@ type PlanResponse struct {
	Changes []*SchemaChangeResponse `json:"changes"`
	LintResults []*LintViolationResponse `json:"lint_violations"`
	Errors []string `json:"errors"`
	// Shards carries the per-shard plan for a sharded engine: each changing shard
	// and the changes it needs. The namespace-level Changes above collapse a
	// keyspace to one entry, so a keyspace whose shards diverge is represented
	// faithfully only here. Empty for non-sharded plans.
	Shards []*ShardPlanResponse `json:"shards,omitempty"`
}

// ShardPlanResponse is one changing shard's plan: the keyspace it belongs to and
// the table changes that shard needs.
type ShardPlanResponse struct {
	Namespace string                 `json:"namespace,omitempty"`
	Shard     string                 `json:"shard"`
	Changes   []*TableChangeResponse `json:"changes,omitempty"`
}

// HasErrors returns true if any lint result has error severity.
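
Because `Shards` is tagged `omitempty`, a non-sharded plan should serialize without a `shards` key at all, so existing consumers see no change. The sketch below illustrates that with trimmed copies of the two structs. Only the JSON tags on `Errors`, `Shards`, and the `ShardPlanResponse` fields come from the diff; the `TableChangeResponse` tags and the `encode` helper are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TableChangeResponse is a trimmed stand-in; its JSON tags are assumed.
type TableChangeResponse struct {
	TableName string `json:"table_name"`
	DDL       string `json:"ddl"`
}

// ShardPlanResponse mirrors the struct added in the diff.
type ShardPlanResponse struct {
	Namespace string                 `json:"namespace,omitempty"`
	Shard     string                 `json:"shard"`
	Changes   []*TableChangeResponse `json:"changes,omitempty"`
}

// PlanResponse keeps only the fields needed to show the omitempty behaviour.
type PlanResponse struct {
	Errors []string             `json:"errors"`
	Shards []*ShardPlanResponse `json:"shards,omitempty"`
}

// encode is a hypothetical helper that marshals a plan to a JSON string.
func encode(p *PlanResponse) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Non-sharded plan: the shards key is omitted entirely.
	fmt.Println(encode(&PlanResponse{})) // prints {"errors":null}

	// Sharded plan: one entry per changing shard.
	fmt.Println(encode(&PlanResponse{
		Shards: []*ShardPlanResponse{{
			Namespace: "cdb_resolute_sharded",
			Shard:     "40-80",
			Changes:   []*TableChangeResponse{{TableName: "mutes", DDL: "ALTER TABLE ..."}},
		}},
	}))
}
```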
1 change: 1 addition & 0 deletions pkg/cmd/commands/preview.go
@@ -92,6 +92,7 @@ func (cmd *PreviewCmd) Run(g *Globals) error {
templates.PreviewCommentMultiDeployCompleted, templates.PreviewCommentMultiDeployAll,
templates.PreviewCLIMultiDeployInProgress, templates.PreviewCLIMultiDeployFailed,
templates.PreviewCLIMultiDeployCompleted, templates.PreviewCLIMultiDeployAll,
templates.PreviewCommentShardedAll,
templates.PreviewCommentSingleProgress, templates.PreviewCommentSingleComplete,
templates.PreviewCommentSingleFailed, templates.PreviewCommentSingleStopped,
templates.PreviewCommentSummaryCompleted, templates.PreviewCommentSummaryFailed,
1 change: 1 addition & 0 deletions pkg/cmd/internal/templates/preview.go
@@ -146,6 +146,7 @@ const (
	PreviewCLIMultiDeployFailed PreviewType = "cli_multi_deploy_failed" // Halt-on-failure: one deployment failed
	PreviewCLIMultiDeployCompleted PreviewType = "cli_multi_deploy_completed" // All deployments completed
	PreviewCLIMultiDeployAll PreviewType = "cli_multi_deploy_all" // Show all CLI multi-deployment apply previews
	PreviewCommentShardedAll PreviewType = "comment_sharded_all" // Show all sharded apply + plan previews

	// Single-table apply comment previews (most common case)
	PreviewCommentSingleProgress PreviewType = "comment_single_progress" // Single table running
14 changes: 14 additions & 0 deletions pkg/cmd/internal/templates/preview_comment.go
@@ -267,6 +267,20 @@ func previewCommentMultiDeployAllOutput() {
	printSections(sections)
}

func previewCommentShardedAllOutput() {
	sections := []struct {
		name string
		fn   func()
	}{
		{"PLAN: DIVERGENT SHARDS", func() { fmt.Print(webhooktemplates.PreviewCommentShardedPlanDivergent()) }},
		{"PLAN: UNSAFE CHANGE ON ONE SHARD", func() { fmt.Print(webhooktemplates.PreviewCommentShardedPlanUnsafe()) }},
		{"APPLY IN PROGRESS", func() { fmt.Print(webhooktemplates.PreviewCommentShardedApplyInProgress()) }},
		{"APPLY FAILED (ONE SHARD FAILED)", func() { fmt.Print(webhooktemplates.PreviewCommentShardedApplyFailed()) }},
		{"APPLY WITH DIVERGENT SHARDS", func() { fmt.Print(webhooktemplates.PreviewCommentShardedApplyDivergent()) }},
	}
	printSections(sections)
}

func previewCLIPlanAllOutput() {
	sections := []struct {
		name string
2 changes: 2 additions & 0 deletions pkg/cmd/internal/templates/preview_dispatch.go
@@ -173,6 +173,8 @@ func PreviewCLIOutput(previewType PreviewType) {
		fmt.Print(webhooktemplates.PreviewCommentMultiDeploymentApplyCompleted())
	case PreviewCommentMultiDeployAll:
		previewCommentMultiDeployAllOutput()
	case PreviewCommentShardedAll:
		previewCommentShardedAllOutput()
	case PreviewCLIMultiDeployInProgress:
		previewCLIMultiDeploymentApplyInProgress()
	case PreviewCLIMultiDeployFailed:
27 changes: 26 additions & 1 deletion pkg/tern/grpc_client.go
@@ -1901,6 +1901,22 @@ func (c *GRPCClient) dispatchPendingApply(ctx context.Context, apply *storage.Ap
		return err
	}

	// Fail closed before dispatch when a shard-scoped operation resolves no target
	// shard. A shard work operation (key "namespace/shard/table") must dispatch
	// exactly one shard; if its tasks carry no shard the dispatch would send an
	// empty TargetShards and the data plane would reject it opaquely with
	// "expected exactly one target shard, got 0". Surfacing it here — as a clear
	// control-plane error — turns a version/data skew into an actionable message
	// instead of a confusing data-plane failure.
	targetShards := taskTargetShards(tasks)
	if scope.operation != nil && isShardWorkOperationKey(scope.operation.OperationKey) && len(targetShards) != 1 {
		errMsg := fmt.Sprintf("queued gRPC apply failed: shard operation %q resolved %d target shards, expected exactly 1 — its tasks carry no shard, so refusing to dispatch (the data plane would reject with \"expected exactly one target shard, got 0\"); this indicates a version or data skew", scope.operation.OperationKey, len(targetShards))
		if markErr := c.markRemoteApplyFailed(ctx, apply, nil, errMsg, false, scope); markErr != nil {
			return fmt.Errorf("mark queued gRPC apply %s failed after shard-scope guard: %w", apply.ApplyIdentifier, markErr)
		}
		return fmt.Errorf("queued gRPC apply %s: %s", apply.ApplyIdentifier, errMsg)
	}

	// Use the per-operation copy-drive options so a multi-operation barrier
	// deployment parks the remote engine at the cutover barrier instead of
	// running straight through the swap. effectiveCopyDriveOptions OR's
@@ -1927,7 +1943,7 @@ func (c *GRPCClient) dispatchPendingApply(ctx context.Context, apply *storage.Ap
		Environment: apply.Environment,
		Target: target,
		Caller: apply.Caller,
-		TargetShards: taskTargetShards(tasks),
+		TargetShards: targetShards,
		IdempotencyKey: remoteApplyIdempotencyKey(apply, scope),
	})
	if err != nil {
@@ -2112,6 +2128,15 @@ func tasksToProtoTableChanges(tasks []*storage.Task) []*ternv1.TableChange {
	return changes
}

// isShardWorkOperationKey reports whether an operation key is a sharded work
// key ("namespace/shard/table") — the per-shard fan-out's unit. A whole-apply
// key (empty) and a finalizer key ("namespace/group_finalizer") are not, so the
// shard-scope guard applies only to per-shard work.
func isShardWorkOperationKey(key string) bool {
	parts := strings.Split(key, "/")
	return len(parts) == 3 && parts[0] != "" && parts[1] != "" && parts[2] != ""
}

func taskTargetShards(tasks []*storage.Task) []string {
	seen := make(map[string]struct{})
	var shards []string
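
To make the key classification concrete, the sketch below copies `isShardWorkOperationKey` from the diff and runs it against the key shapes its comment names. `checkShardScope` is a hypothetical helper, not code from this PR: it shows one way the shard-scope guard could report the zero-shard and multi-shard cases separately, since `len(targetShards) != 1` also fires when more than one shard is resolved while the PR's message describes only the empty case. The sample keys are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// isShardWorkOperationKey is copied from the diff above.
func isShardWorkOperationKey(key string) bool {
	parts := strings.Split(key, "/")
	return len(parts) == 3 && parts[0] != "" && parts[1] != "" && parts[2] != ""
}

// checkShardScope is a hypothetical guard: a shard work key must resolve to
// exactly one target shard; any other key is not checked.
func checkShardScope(key string, targetShards []string) error {
	if !isShardWorkOperationKey(key) {
		return nil
	}
	switch n := len(targetShards); {
	case n == 0:
		return fmt.Errorf("shard operation %q resolved no target shard", key)
	case n > 1:
		return fmt.Errorf("shard operation %q resolved %d target shards, expected exactly 1", key, n)
	}
	return nil
}

func main() {
	keys := []string{
		"",                                     // whole-apply key
		"cdb_resolute_sharded/group_finalizer", // finalizer key
		"cdb_resolute_sharded/-40/mutes",       // per-shard work key
		"cdb_resolute_sharded//mutes",          // empty shard segment
	}
	for _, k := range keys {
		fmt.Printf("%q shard work key: %v\n", k, isShardWorkOperationKey(k))
	}
	fmt.Println(checkShardScope("cdb_resolute_sharded/-40/mutes", nil))
}
```

Only the three-segment key with all segments non-empty is treated as per-shard work, so whole-apply and finalizer operations pass through the guard untouched.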