Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions .changeset/hungry-cloths-hunt.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
---
'@pgflow/core': patch
---

Improve failure handling and prevent orphaned messages in queue

- Archive all queued messages when a run fails to prevent resource waste
- Handle type constraint violations gracefully without exceptions
- Store output on failed tasks (including type violations) for debugging
- Add performance index for efficient message archiving
- Prevent retries on already-failed runs
- Update table constraint to allow output storage on failed tasks
17 changes: 17 additions & 0 deletions .claude/commands/fix-sql-tests.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
Your job is to fix SQL tests, either by fixing the tests if those are invalid,
or updating the SQL functions in pkgs/core/schemas/ and trying again.

If updating functions, load them with psql.

!`pnpm nx supabase:status core --output env | grep DB_URL`
PWD: !`pwd`

To rerun the test(s), run this command from `pkgs/core` directory:

`scripts/run-test-with-colors supabase/tests/<testfile>`

Do not create any migratons or try to run tests with nx.

<test_failures>
!`pnpm nx test:pgtap core`
</test_failures>
40 changes: 40 additions & 0 deletions .claude/commands/test-first-sql.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
Your job is to implement the feature below in a test-first manner.
First, you must idenfity what things you want to test for.
Then you must write one test at a time, from the simplest, more generic,
to more precise (if applicable, sometimes you only need to write one test per
thing, without multiple per thing).

To run the test(s), run this command from `pkgs/core` directory:

`scripts/run-test-with-colors supabase/tests/<testfile>`

The newly written test must fail for the correct reasons.

In order to make the test pass, you need to update function
code in pkgs/core/schemas/.

After updating you should use `psql` to execute function file
and update function in database.

!`pnpm nx supabase:status core --output env | grep DB_URL`
PWD: !`pwd`

Repeat until all the added tests are passing.

When they do, run all the tests like this:

`scripts/run-test-with-colors supabase/tests/`

Do not create any migratons or try to run tests with nx.

Never use any INSERTs or UPDATEs to prepare or mutate state for the test.
Instead, use regular pgflow.\* SQL functions or functions that are
available in pkgs/core/supabase/tests/seed.sql:

!`grep 'function.*pgflow_tests' pkgs/core/supabase/seed.sql -A7`

Check how they are used in other tests.

<feature_to_implement>
$ARGUMENTS
</feature_to_implement>
72 changes: 36 additions & 36 deletions PLAN.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
- ✅ **DONE**: Array element extraction - tasks receive individual array elements
- ✅ **DONE**: Output aggregation - inline implementation aggregates map task outputs for dependents
- ✅ **DONE**: DSL support for `.map()` for defining map steps with compile-time duplicate detection
- **TODO**: Fix orphaned messages on run failure
- **DONE**: Fix orphaned messages on run failure
- ⏳ **TODO**: Performance optimization with step_states.output column

### Chores
Expand Down Expand Up @@ -106,18 +106,28 @@
- Updated DSL README with .map() documentation
- Created detailed changeset

- [x] **PR #219: Fix Orphaned Messages on Run Failure** - `09-18-fix-orphaned-messages-on-fail` ✅ COMPLETED

- Archives all queued messages when run fails (prevents orphaned messages)
- Handles type constraint violations gracefully without exceptions
- Added guards to prevent any mutations on failed runs:
- complete_task returns unchanged
- start_ready_steps exits early
- cascade_complete_taskless_steps returns 0
- Added performance index for efficient message archiving
- Tests unstashed and passing (archive_sibling_map_tasks, archive_messages_on_type_constraint_failure)
- Updated core README with failure handling mentions
- **Critical fix: prevents queue performance degradation in production**

#### ❌ Remaining Work (Priority Order)

- [ ] **Priority 1: Fix Orphaned Messages on Run Failure** 🚨 CRITICAL
- [ ] **Integration Tests**

- Archive all pending messages when run fails
- Handle map sibling tasks specially
- Fix type constraint violations to fail immediately without retries
- See detailed plan: [PLAN_orphaned_messages.md](./PLAN_orphaned_messages.md)
- **Critical for production: prevents queue performance degradation**
- Tests already written (stashed) that document the problem
- End-to-end workflows with real array data
- Basic happy path coverage
- This should be minimal and added to the Edge Worker integration test suite for now

- [ ] **Priority 2: Performance Optimization - step_states.output Column**
- [ ] **Performance Optimization - step_states.output Column**

- Migrate from inline aggregation to storing outputs in step_states
- See detailed plan: [PLAN_step_output.md](./PLAN_step_output.md)
Expand All @@ -132,43 +142,33 @@
- Update all aggregation tests (~17 files)
- **Note**: This is an optimization that should be done after core functionality is stable

- [ ] **Priority 3: Integration Tests**

- End-to-end workflows with real array data
- Basic happy path coverage
- This should be minimal and added to the Edge Worker integration test suite for now

- [ ] **Priority 4: Update core README**

- `pkgs/core/README.md`

- Add new section describing the step types
- Describe single step briefly, focus on describing map step type and how it differs
- Make sure to mention that maps are constrained to have exactly one dependency
- Show multiple cases of inputs -> task creation
- Explain edge cases (empty array propagation, invalid array input)
- Explain root map vs dependent map and how it gets handled and what restrictions those apply on the Flow input
- Explain cascade completion of taskless steps and its limitations
- [ ] **Update `pkgs/core/README.md`**

- [ ] **Priority 5: Add docs page**
- Add new section describing the step types
- Describe single step briefly, focus on describing map step type and how it differs
- Make sure to mention that maps are constrained to have exactly one dependency
- Show multiple cases of inputs -> task creation
- Explain edge cases (empty array propagation, invalid array input)
- Explain root map vs dependent map and how it gets handled and what restrictions those apply on the Flow input
- Explain cascade completion of taskless steps and its limitations

- **Add basic docs page**
- [ ] **Add docs page**

- put it into `pkgs/website/src/content/docs/concepts/array-and-map-steps.mdx`
- describe the DSL and how the map works and why we need it
- show example usage of root map
- show example usage of dependent map
- focus mostly on how to use it, instead of how it works under the hood
- link to the README's for more details
- put it into `pkgs/website/src/content/docs/concepts/array-and-map-steps.mdx`
- describe the DSL and how the map works and why we need it
- show example usage of root map
- show example usage of dependent map
- focus mostly on how to use it, instead of how it works under the hood
- link to the README's for more details

- [ ] **Priority 6: Migration Consolidation** (Do this last before merge!)
- [ ] **Migration Consolidation**

- Remove all temporary/incremental migrations from feature branches
- Generate a single consolidated migration for the entire map infrastructure
- Ensure clean migration path from current production schema
- If NULL improvement is done, include it in the consolidated migration

- [ ] **Priority 7: Graphite Stack Merge**
- [ ] **Graphite Stack Merge**

- Configure Graphite merge queue for the complete PR stack
- Ensure all PRs in sequence can be merged together
Expand Down
184 changes: 0 additions & 184 deletions PLAN_orphaned_messages.md

This file was deleted.

Loading
Loading