@jumski jumski commented Sep 16, 2025

Enhanced Map Step Input Processing

  • Updated start_tasks function to build step input conditionally based on step type
  • Implemented logic for root and dependent map steps to extract array elements
  • Added a new migration script to handle array elements in start_tasks
  • Included comprehensive tests for dependent map element extraction, large array processing, mixed JSON types, and nested arrays
  • Improved handling of array-based tasks and input construction for various step configurations
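
The conditional input construction listed above can be sketched in isolation. This is a minimal, self-contained illustration rather than the migration itself: the `tasks` CTE, its column names, and the `{"run": ...}` shape used for non-map steps are assumptions made for the example. Only `jsonb_array_element`, `jsonb_each`, and the `LIMIT 1` subquery appear in the snippets quoted in this conversation.

```sql
-- Hypothetical sample rows: one regular step, one root map step, one dependent map step.
with tasks(step_type, deps_count, task_index, run_input, deps_output) as (
  values
    ('single', 0, 0, '{"url": "a"}'::jsonb,    '{}'::jsonb),
    ('map',    0, 1, '["a", "b", "c"]'::jsonb, '{}'::jsonb),
    ('map',    1, 2, '{"x": 1}'::jsonb,        '{"fetch": [10, 20, 30]}'::jsonb)
)
select
  step_type,
  deps_count,
  task_index,
  case
    -- Root map step: take one element of the run input array.
    when step_type = 'map' and deps_count = 0 then
      jsonb_array_element(run_input, task_index)
    -- Dependent map step: take one element of the single dependency's output array.
    when step_type = 'map' then
      (select jsonb_array_element(value, task_index)
       from jsonb_each(deps_output)
       limit 1)
    -- Any other step: assumed shape, run input merged with dependency outputs.
    else
      jsonb_build_object('run', run_input) || deps_output
  end as task_input
from tasks;
-- task_input per row: {"run": {"url": "a"}}, "b", 30
```

Each map task receives a single array element as its input, while a non-map step keeps the whole object.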


changeset-bot bot commented Sep 16, 2025

⚠️ No Changeset found

Latest commit: 781e413

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

coderabbitai bot commented Sep 16, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

@jumski jumski marked this pull request as ready for review September 16, 2025 19:53

nx-cloud bot commented Sep 16, 2025

View your CI Pipeline Execution ↗ for commit 781e413

| Command | Status | Duration |
| --- | --- | --- |
| nx run-many -t build --projects client,dsl --co... | ✅ Succeeded | 4s |
| nx affected -t build --configuration=production... | ✅ Succeeded | 4s |
| nx affected -t lint typecheck test --parallel -... | ✅ Succeeded | 5m 54s |

☁️ Nx Cloud last updated this comment at 2025-09-17 06:39:09 UTC

@jumski jumski force-pushed the 09-16-handle-arrays-in-start-tasks branch 3 times, most recently from ce60c8e to a462a29 Compare September 17, 2025 04:39
Comment on lines 352 to 353
(select flow_start_time_ms <= task_creation_time_ms * 1.5
from map_performance

The column names in this comparison do not match their contents. The test checks that array processing scales linearly, but it reads both ratios from columns named for timings:

select ok(
  (select flow_start_time_ms <= task_creation_time_ms * 1.5
   from map_performance
   where array_size = -1),
  'Array processing MUST scale linearly (time_ratio <= size_ratio * 1.5)'
);

In the sentinel row (where array_size = -1), flow_start_time_ms contains the time ratio and task_creation_time_ms contains the size ratio, so the comparison matches the test description but is hard to read. With columns named for what they hold (this assumes map_performance exposes time_ratio and size_ratio, for example through renamed columns or aliases), it would be:

select ok(
  (select time_ratio <= size_ratio * 1.5
   from map_performance
   where array_size = -1),
  'Array processing MUST scale linearly (time_ratio <= size_ratio * 1.5)'
);

With the existing column names, the query stays as written; a comment documenting what the sentinel row stores would make the intent clear.
Suggested change
- (select flow_start_time_ms <= task_creation_time_ms * 1.5
-  from map_performance
+ (select time_ratio <= size_ratio * 1.5
+  from map_performance

Spotted by Diamond
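
The sentinel-row comparison discussed in this comment can be tried without the test suite. The sketch below is illustrative only: the timings are made-up numbers, the `ratios` CTE is a hypothetical alias layer, and pgTAP's `ok()` is left out so the query runs on a plain PostgreSQL session.

```sql
-- Made-up measurements; the row with array_size = -1 is the sentinel that
-- stores the time ratio and the size ratio in the two timing columns.
with map_performance(array_size, flow_start_time_ms, task_creation_time_ms) as (
  values
    (100,  12.0,        30.0),
    (1000, 95.0,        310.0),
    (-1,   95.0 / 12.0, 1000.0 / 100.0)
),
-- Alias the sentinel columns so the comparison reads like the test description.
ratios as (
  select
    flow_start_time_ms    as time_ratio,
    task_creation_time_ms as size_ratio
  from map_performance
  where array_size = -1
)
select time_ratio <= size_ratio * 1.5 as scales_linearly
from ratios;
-- scales_linearly is true for these numbers (about 7.9 <= 15)
```

Aliasing in a CTE keeps the stored columns unchanged while making the assertion self-describing.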


Comment on lines +133 to +135
(SELECT jsonb_array_element(value, st.task_index)
FROM jsonb_each(dep_out.deps_output)
LIMIT 1)

Critical bug: the subquery uses LIMIT 1 without ORDER BY, so the result is non-deterministic if a map step ever has more than one dependency. The code comment states 'Map steps have exactly 1 dependency (enforced by add_step)', but nothing in this query enforces that. If the constraint is ever violated, map tasks could receive elements from an arbitrary dependency. The query should either add ORDER BY for deterministic results or add a runtime check that exactly one dependency exists.

Suggested change
- (SELECT jsonb_array_element(value, st.task_index)
-  FROM jsonb_each(dep_out.deps_output)
-  LIMIT 1)
+ (SELECT jsonb_array_element(value, st.task_index)
+  FROM jsonb_each(dep_out.deps_output)
+  WHERE (SELECT COUNT(*) FROM jsonb_each(dep_out.deps_output)) = 1
+  LIMIT 1)

Spotted by Diamond
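
The effect of the suggested guard can be checked on sample data. This is a minimal sketch: the `dep_out` CTE and its two rows are hypothetical, and the fixed index 1 stands in for `st.task_index`.

```sql
-- Hypothetical dependency outputs: the first row has one dependency, the second has two.
with dep_out(deps_output) as (
  values
    ('{"fetch": [10, 20, 30]}'::jsonb),
    ('{"a": [1, 2], "b": [3, 4]}'::jsonb)
)
select
  deps_output,
  (select count(*) from jsonb_each(deps_output)) as dep_count,
  -- Same shape as the suggested change: only yield an element
  -- when exactly one dependency is present.
  (select jsonb_array_element(value, 1)
   from jsonb_each(deps_output)
   where (select count(*) from jsonb_each(deps_output)) = 1
   limit 1) as element
from dep_out;
-- element is 20 for the first row and NULL for the second
```

Note that with this guard a multi-dependency step yields NULL silently rather than raising an error, so the caller would still need to treat NULL as a failure.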


…handling migration

- Updated start_tasks function to build step input conditionally based on step type
- Implemented logic for root and dependent map steps to extract array elements
- Added a new migration script to handle array elements in start_tasks
- Included comprehensive tests for dependent map element extraction, large array processing, mixed JSON types, and nested arrays
- Improved handling of array-based tasks and input construction for various step configurations
@jumski jumski force-pushed the 09-16-handle-arrays-in-start-tasks branch from 419de1b to 781e413 Compare September 17, 2025 06:31
-- Extract the element at task_index from the run's input array.
-- Note: If run input is not an array, this will return NULL
-- and the flow will fail (validated in start_flow).
jsonb_array_element(r.input, st.task_index)

Array bounds vulnerability: jsonb_array_element(r.input, st.task_index) returns NULL if task_index is out of bounds, and this NULL is passed directly as task input without validation. If a race condition or bug in task creation results in task_index >= the array length, tasks will receive NULL input and likely fail at runtime. Add bounds checking or handle the NULL case explicitly.

Suggested change
- jsonb_array_element(r.input, st.task_index)
+ CASE
+   WHEN jsonb_array_length(r.input) > st.task_index THEN jsonb_array_element(r.input, st.task_index)
+   ELSE jsonb_build_object('error', 'Task index out of bounds')
+ END

Spotted by Diamond
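
The NULL behavior described in this comment, and one way to guard against it, can be sketched as follows. The `sample` CTE, its rows, and the error messages are hypothetical. One detail worth checking before adopting a length-based guard: `jsonb_array_length` raises an error when its argument is not an array, whereas `jsonb_array_element` returns NULL, so the sketch tests `jsonb_typeof` first.

```sql
-- jsonb_array_element returns NULL for an out-of-range index and for a non-array value.
select
  jsonb_array_element('["a", "b", "c"]'::jsonb, 1) as in_bounds,      -- "b"
  jsonb_array_element('["a", "b", "c"]'::jsonb, 3) as out_of_bounds,  -- NULL
  jsonb_array_element('{"k": "v"}'::jsonb, 0)      as not_an_array;   -- NULL

-- Guarded variant. The WHEN branches of CASE are tested in order, so
-- jsonb_array_length is only reached once the value is known to be an array.
with sample(run_input, task_index) as (
  values
    ('["a", "b", "c"]'::jsonb, 1),
    ('["a", "b", "c"]'::jsonb, 3),
    ('{"k": "v"}'::jsonb,      0)
)
select
  run_input,
  task_index,
  case
    when jsonb_typeof(run_input) <> 'array' then
      jsonb_build_object('error', 'Run input is not an array')
    when task_index >= jsonb_array_length(run_input) then
      jsonb_build_object('error', 'Task index out of bounds')
    else
      jsonb_array_element(run_input, task_index)
  end as task_input
from sample;
```

Whether a task should receive an error object as input or the function should raise instead is a design choice this sketch does not settle.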


🔍 Preview Deployment: Website

Deployment successful!

🔗 Preview URL: https://pr-216.pgflow.pages.dev

📝 Details:

  • Branch: 09-16-handle-arrays-in-start-tasks
  • Commit: a6847c6feda27314a9909b6a65c51fd8a0448eed

🔍 Preview Deployment: Playground

Deployment successful!

🔗 Preview URL: https://pr-216--pgflow-demo.netlify.app

📝 Details:

  • Branch: 09-16-handle-arrays-in-start-tasks
  • Commit: a6847c6feda27314a9909b6a65c51fd8a0448eed