Skip to content

feat: support nesting task runs into each other#1126

Merged
leonardmq merged 12 commits intoleonard/kil-447-feat-stream-multiturn-ai-sdk-openai-protocolsfrom
leonard/kil-461-feat-nesting-task-runs-into-each-other
Mar 17, 2026
Merged

feat: support nesting task runs into each other#1126
leonardmq merged 12 commits intoleonard/kil-447-feat-stream-multiturn-ai-sdk-openai-protocolsfrom
leonard/kil-461-feat-nesting-task-runs-into-each-other

Conversation

@leonardmq
Copy link
Collaborator

@leonardmq leonardmq commented Mar 13, 2026

What does this PR do?

PR going into #1107 (multiturn + streaming)

Allow nesting TaskRun into each other. This allows forking and each step in the conversation to be a complete and immutable snapshot with its own ID - avoids headaches with concurrent writes and so on.

Our current modeling only allows for one parent type per model, so this required refactoring and dropping a check on the BaseModel - if a parent is of the wrong type up the tree, we won't find out, but it will break elsewhere anyway, so seems like a reasonable tradeoff to make.

Also added helper query wrappers on the Task and TaskRun - since now given a TaskRun->id, we will need to DFS over all the descendants of the Task - unless we now any one of its ancestors' ID (in which case we can query starting with that ancestor as root for our query).

Checklists

  • Tests have been run locally and passed
  • New tests have been added to any work in /lib

Summary by CodeRabbit

  • New Features

    • TaskRuns can be nested under Tasks or other TaskRuns with parent navigation, child listing, root detection, and DFS search by ID.
    • Loading and iteration of children now handle polymorphic parents (Task or TaskRun) with clearer validation and error messages.
  • Documentation

    • Expanded API docs describing nested TaskRun structures, storage layout, and polymorphic parenting.
  • Tests

    • Extensive tests added for nested hierarchies, traversal, DFS search, loading, validation, and edge cases.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the data model by allowing TaskRun instances to be nested. This change provides greater flexibility in modeling complex workflows and conversations, where each step can be an independent, immutable TaskRun. The core data model was adapted to support polymorphic parent relationships, and new utility functions were added to navigate these nested structures efficiently. The update ensures that the system can handle more intricate execution flows while maintaining data integrity.

Highlights

  • Nested Task Runs: Enabled TaskRun objects to be nested within other TaskRun objects, allowing for a hierarchical structure of task executions. This supports immutable snapshots for each step in a conversation, addressing concurrent write concerns.
  • Polymorphic Parent Support: Refactored the BaseModel to support polymorphic parent types, specifically allowing TaskRun to accept both Task and other TaskRun instances as parents. This involved modifying parent type validation logic.
  • DFS Query Helpers: Introduced Depth-First Search (DFS) helper methods (find_task_run_by_id_dfs) on both Task and TaskRun models. These methods facilitate efficient searching for specific TaskRun instances within the potentially deeply nested run hierarchy.
  • API Schema Updates: Updated the API schema documentation to reflect the new capabilities of TaskRun nesting and its polymorphic parent relationships.
  • Comprehensive Testing: Added extensive new tests to validate the nested TaskRun folder structure, parent/child relationships, and the functionality of the new DFS search methods.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • app/web_ui/src/lib/api_schema.d.ts
    • Updated comments for 'TaskRun-Input' and 'TaskRun-Output' to describe nesting capabilities and polymorphic parent acceptance.
  • libs/core/kiln_ai/datamodel/basemodel.py
    • Refactored check_parent_type into a private _check_parent_type method to allow subclasses to specify multiple valid parent types.
    • Modified iterate_children_paths_of_parent_path to remove direct parent type validation, accommodating polymorphic parent models.
  • libs/core/kiln_ai/datamodel/task.py
    • Added find_task_run_by_id_dfs method to perform a Depth-First Search for a TaskRun within the task's entire run tree.
  • libs/core/kiln_ai/datamodel/task_run.py
    • Imported Optional, KilnBaseModel, and KilnParentModel for enhanced type hinting and polymorphic behavior.
    • Modified TaskRun class definition to inherit from KilnParentModel and specified parent_of={}.
    • Added parent_task method to retrieve the root Task for a potentially nested TaskRun.
    • Added parent_run method to get the immediate TaskRun parent if nested.
    • Added is_root_task_run method to determine if a TaskRun is at the top level of its hierarchy.
    • Added find_task_run_by_id_dfs method for searching child TaskRun instances within its own subtree.
    • Overrode load_parent to support loading either a Task or another TaskRun as its parent.
    • Overrode check_parent_type to validate that the parent is either a Task or a TaskRun.
    • Dynamically wired TaskRun as its own child type for the 'runs' relationship, enabling self-nesting.
  • libs/core/kiln_ai/datamodel/test_models.py
    • Imported Union for type hinting in new test functions.
    • Added a task pytest fixture for consistent test setup.
    • Introduced test_nested_task_run_folder_structure to verify the file system layout of nested runs.
    • Added test_nested_task_runs_multiple_levels to confirm correct parent/child relationships across several nesting depths.
    • Included test_parent_task_deeply_nested_task_run to ensure the correct root Task is identified for deeply nested runs.
    • Added tests for find_nested_task_run_by_id_given_parent_run and find_root_task_run_by_id_given_task.
    • Implemented test_find_deeply_nested_find_task_run_by_id_dfs to test the DFS search from a Task.
    • Added test_find_task_run_by_id_dfs_finds_deeply_nested_run and test_find_task_run_by_id_dfs_finds_root_run for Task's DFS method.
    • Included test_find_task_run_by_id_dfs_returns_none_when_not_found for Task's DFS method.
    • Added test_task_run_find_task_run_by_id_dfs_finds_deeply_nested_run and test_task_run_find_task_run_by_id_dfs_finds_direct_child for TaskRun's DFS method.
    • Included test_task_run_find_task_run_by_id_dfs_returns_none_when_not_found for TaskRun's DFS method.
    • Added test_is_root_task_run to verify the is_root_task_run method.
Activity
  • The author, leonardmq, created the pull request with a clear description of the feature and its implications.
  • The author included a checklist indicating that tests have been run locally and new tests have been added for the /lib directory.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds polymorphic, nested TaskRun support: TaskRuns may be children of Tasks or other TaskRuns. Introduces parent-type validators and polymorphic parent loading, DFS search over nested runs, child-path iteration that inspects on-disk model_type, documentation comments, and extensive tests covering new behaviors.

Changes

Cohort / File(s) Summary
API schema docs
app/web_ui/src/lib/api_schema.d.ts
Added clarifying comments to TaskRun-Input / TaskRun-Output about nesting under another TaskRun, storage in a runs subfolder, and polymorphic parent support (Task or TaskRun).
Base model / parent validation
libs/core/kiln_ai/datamodel/basemodel.py
Added _parent_types(), _check_parent_type() and @model_validator wrapper to validate polymorphic parents; iterate_children_paths_of_parent_path now verifies parent path exists and inspects on-disk model_type when multiple parent types are allowed.
TaskRun datamodel
libs/core/kiln_ai/datamodel/task_run.py
Made TaskRun polymorphic-parent aware (inherits KilnParentedModel, KilnParentModel, parent_of wiring). Added _parent_types() returning [Task, TaskRun], load_parent(), parent_task(), parent_run(), runs(readonly), is_root_task_run(), check_parent_type() validator, DFS search, and child-collection wiring for runs.
Task datamodel
libs/core/kiln_ai/datamodel/task.py
Added find_task_run_by_id_dfs() to traverse TaskRun subtrees via DFS starting from self.runs(readonly=...).
Tests: datamodel
libs/core/kiln_ai/datamodel/test_models.py, libs/core/kiln_ai/datamodel/test_basemodel.py, libs/core/kiln_ai/datamodel/test_task.py
Added extensive tests for polymorphic parent scenarios, nested TaskRun hierarchies, parent/child resolution, DFS discovery, on-disk model_type validation, error cases, and persistence across load/save.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Task
    participant ParentRun as TaskRun(parent)
    participant FS as FileSystem
    participant ChildRun as TaskRun(child)

    Client->>Task: find_task_run_by_id_dfs(id)
    Task->>Task: iterate self.runs(readonly)
    Task->>ParentRun: check run.id
    alt id matches
        ParentRun-->>Task: return TaskRun
    else not matched
        ParentRun->>FS: list/load child runs from runs/ subfolder
        FS-->>ChildRun: return child TaskRun data
        ChildRun->>ChildRun: find_task_run_by_id_dfs(id)
        ChildRun-->>ParentRun: return match or None
        ParentRun-->>Task: propagate result
    end
    Task-->>Client: return TaskRun | None
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • sfierro
  • scosman
  • chiang-daniel

Poem

🐰 In folders deep where run-folders hide,
A rabbit hops through parents wide.
Task or Run — whichever's true,
DFS traces paths anew.
Tests and tails, we bounce with pride.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 72.13% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: support nesting task runs into each other' directly and clearly describes the main feature being added—enabling TaskRun nesting—which is the primary purpose of the PR.
Description check ✅ Passed The PR description clearly explains the goal (enabling TaskRun nesting), the rationale, and implementation approach. It includes completed checklists confirming tests passed and new tests added.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch leonard/kil-461-feat-nesting-task-runs-into-each-other
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Mar 13, 2026

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/leonard/kil-447-feat-stream-multiturn-ai-sdk-openai-protocols...HEAD

  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (100%)
  • libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py (100%)
  • libs/core/kiln_ai/datamodel/basemodel.py (92.6%): Missing lines 649,668
  • libs/core/kiln_ai/datamodel/task.py (100%)
  • libs/core/kiln_ai/datamodel/task_run.py (95.2%): Missing lines 163,184,236

Summary

  • Total: 107 lines
  • Missing: 5 lines
  • Coverage: 95%

Line-by-line

View line-by-line diff coverage

libs/core/kiln_ai/datamodel/basemodel.py

Lines 645-653

  645         if parent_types_override is None:
  646             # Default: single expected parent type — original behaviour
  647             parent = cls.parent_type().load_from_file(parent_path)
  648             if parent is None:
! 649                 raise ValueError("Parent must be set to load children")
  650         else:
  651             # Polymorphic parent: read only the model_type field to avoid a full load.
  652             with open(parent_path, "r", encoding="utf-8") as fh:
  653                 actual_parent_type_name = json.loads(fh.read()).get("model_type", "")

Lines 664-672

  664                 if t.type_name() == actual_parent_type_name
  665             )
  666             parent = parent_type.load_from_file(parent_path)
  667             if parent is None:
! 668                 raise ValueError("Parent must be set to load children")
  669 
  670         # Ignore type error: this is abstract base class, but children must implement relationship_name
  671         relationship_folder = parent_folder / Path(cls.relationship_name())  # type: ignore

libs/core/kiln_ai/datamodel/task_run.py

Lines 159-167

  159 
  160             if not isinstance(parent, TaskRun):
  161                 # the parent is not a TaskRun, but also not a Task, so it is not
  162                 # a real parent
! 163                 return None
  164 
  165             # the parent is a TaskRun, so we just walk up the tree until we find a Task
  166             current = parent

Lines 180-188

  180         return [Task, TaskRun]
  181 
  182     def runs(self, readonly: bool = False) -> list["TaskRun"]:
  183         """The list of child task runs."""
! 184         return super().runs(readonly=readonly)  # type: ignore
  185 
  186     def is_root_task_run(self) -> bool:
  187         """Is this the root task run? (not nested under another task run)"""
  188         # lazy import to avoid circular dependency

Lines 232-240

  232             loaded_parent_task = Task.load_from_file(task_path)
  233             super().__setattr__("parent", loaded_parent_task)
  234             return loaded_parent_task
  235 
! 236         return None
  237 
  238     @model_validator(mode="after")
  239     def check_parent_type(self) -> Self:
  240         """Check that the parent is a Task or TaskRun. This overrides the default parent type check


Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant feature by enabling the nesting of TaskRun objects, which is crucial for modeling conversations or multi-step tasks. The implementation is robust, with thoughtful changes to the data models, including support for polymorphic parents in KilnBaseModel. The addition of DFS search methods on Task and TaskRun is a valuable utility for navigating the new nested structure, and the test coverage is comprehensive. I have a few suggestions to enhance maintainability and robustness, primarily concerning code duplication and a recursive implementation that could be made iterative. Overall, this is an excellent and well-executed pull request.


@model_validator(mode="after")
def check_parent_type(self) -> Self:
def _check_parent_type(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just wrapping the internal of the default check_parent_type to allow calling it from the subclass (TaskRun) and passing the multiple parent types we allow.

return self

# some models support having multiple parent types, so we allow overriding the expected parent
if expected_parent_types is not None:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if branch is the new part - we can override the default check_parent_type from the subclass and pass in the types we allow (in the case of TaskRun, we allow Task and TaskRun as valid parents)

return self

@model_validator(mode="after")
def check_parent_type(self) -> Self:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backward compatible - no change there in logic



class TaskRun(KilnParentedModel):
class TaskRun(KilnParentedModel, KilnParentModel, parent_of={}):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parent_of is set further down procedurally (cannot declare it here because it references itself)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/core/kiln_ai/datamodel/basemodel.py`:
- Around line 626-629: Restore the parent-type guard for non-polymorphic child
lookups by reinstating a check that verifies the provided parent is an allowed
type before proceeding (do not rely solely on parent_path.exists()). Add an
overridable hook on KilnParentedModel (e.g., accepted_parent_types or
is_accepted_parent(parent_type)) and use it in all_children_of_parent_path() and
from_id_and_parent_path() to enforce the original defensive validation for
typical models while allowing TaskRun (or other polymorphic cases) to override
the hook to accept multiple parent types; if the parent type is not accepted,
raise the same ValueError("Parent must be set to load children").

In `@libs/core/kiln_ai/datamodel/task_run.py`:
- Around line 198-205: The code currently swallows a ValueError when loading a
sibling task_run file, causing malformed nested runs to appear as root runs;
modify the block that calls TaskRun.load_from_file (referencing
TaskRun.base_filename(), TaskRun.load_from_file, and the "parent" attribute
assignment) to not silently pass on exceptions—instead re-raise or wrap the
caught ValueError with context (e.g., include the problematic task_run_path) so
the failure surfaces immediately; this ensures parent_task(), parent_run(), and
is_root_task_run() cannot mistakenly treat a broken parent as missing.
- Around line 201-213: The lazy parent-loading in TaskRun.parent (during
TaskRun.load_from_file) ignores the readonly flag and always constructs mutable
ancestors; update the loading calls to propagate the original readonly intent by
passing readonly=True when the current TaskRun was loaded readonly (e.g.,
forward the readonly parameter into TaskRun.load_from_file and
Task.load_from_file calls inside the parent resolution path), so
parent_run()/parent_task() return immutable instances and participate in the
shared readonly cache (refer to TaskRun.load_from_file, the parent attribute
assignment, and parent_run()/parent_task() resolution logic).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 406f488e-1721-49ee-a144-b0706b755b64

📥 Commits

Reviewing files that changed from the base of the PR and between c516dca and 89133f6.

📒 Files selected for processing (5)
  • app/web_ui/src/lib/api_schema.d.ts
  • libs/core/kiln_ai/datamodel/basemodel.py
  • libs/core/kiln_ai/datamodel/task.py
  • libs/core/kiln_ai/datamodel/task_run.py
  • libs/core/kiln_ai/datamodel/test_models.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
libs/core/kiln_ai/datamodel/test_models.py (1)

792-976: Consider extracting a helper for repeated nested-run setup.

Several tests recreate the same 2–3 level run chain. A small factory/helper would reduce duplication and make intent clearer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/datamodel/test_models.py` around lines 792 - 976, Tests
like test_nested_task_runs_multiple_levels,
test_parent_task_deeply_nested_task_run,
test_find_task_run_by_id_dfs_finds_deeply_nested_run, and others repeatedly
build 2–3 level TaskRun chains with TaskRun(..., parent=...) and save_to_file();
extract a small helper (e.g., create_nested_runs or make_run_chain) that accepts
the root Task, a list of input values (or depth), creates the chain using
TaskRun and save_to_file, returns the created runs (or final run), and then
replace the duplicated setup code in those tests to call the helper and assert
against its results (use TaskRun, save_to_file, .id, .path, and relevant test
assertions unchanged).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/core/kiln_ai/datamodel/test_models.py`:
- Line 3: The import "Union" from typing in test_models.py is unused and causing
a Ruff F401 lint failure; remove "Union" from the import statement (the line
"from typing import Union") or replace the import with only the actually used
names so that the module no longer imports the unused symbol.

---

Nitpick comments:
In `@libs/core/kiln_ai/datamodel/test_models.py`:
- Around line 792-976: Tests like test_nested_task_runs_multiple_levels,
test_parent_task_deeply_nested_task_run,
test_find_task_run_by_id_dfs_finds_deeply_nested_run, and others repeatedly
build 2–3 level TaskRun chains with TaskRun(..., parent=...) and save_to_file();
extract a small helper (e.g., create_nested_runs or make_run_chain) that accepts
the root Task, a list of input values (or depth), creates the chain using
TaskRun and save_to_file, returns the created runs (or final run), and then
replace the duplicated setup code in those tests to call the helper and assert
against its results (use TaskRun, save_to_file, .id, .path, and relevant test
assertions unchanged).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d68b321f-be31-4161-ab8a-81bbe4afe32c

📥 Commits

Reviewing files that changed from the base of the PR and between 89133f6 and dc74f47.

📒 Files selected for processing (1)
  • libs/core/kiln_ai/datamodel/test_models.py

@leonardmq leonardmq marked this pull request as draft March 13, 2026 17:35
@leonardmq leonardmq marked this pull request as ready for review March 16, 2026 17:17
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
libs/core/kiln_ai/datamodel/task_run.py (1)

180-182: runs() is defined twice; the later dynamic wiring overwrites the class method.

This creates dead code/confusion around which implementation is active. Keep one source of truth for runs().

Also applies to: 369-370

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/datamodel/task_run.py` around lines 180 - 182, There are
two definitions for runs() causing the later dynamic wiring to overwrite the
class method; pick a single source of truth by removing one duplicate. Either
keep the static method signature def runs(self, readonly: bool = False) ->
list["TaskRun"] and delete the later dynamic wiring that reassigns/overwrites
runs, or remove the static definition and ensure the dynamic wiring provides the
full implementation (including type hints and docstring) so only one runs
implementation remains; update references to TaskRun.runs accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/core/kiln_ai/datamodel/task_run.py`:
- Around line 188-201: The DFS in find_task_run_by_id_dfs misses the current
node because it seeds stack with list(self.runs(...)) only; update the traversal
to include self in the search (respecting the readonly argument) by either
checking if self.id == task_run_id before the loop or seeding stack with [self]
+ list(self.runs(readonly=readonly)) so the method correctly searches the entire
task run tree starting at this node.
- Around line 166-171: parent_run() currently only checks cached_parent() and
thus can miss a lazily-loadable parent; change it to call the loader (use
self.parent()) so the parent is loaded if necessary, then validate its type
(isinstance(..., TaskRun)) and return it or None; update the logic in the
parent_run method to prefer self.parent() over self.cached_parent() while
keeping the same return behavior.

---

Nitpick comments:
In `@libs/core/kiln_ai/datamodel/task_run.py`:
- Around line 180-182: There are two definitions for runs() causing the later
dynamic wiring to overwrite the class method; pick a single source of truth by
removing one duplicate. Either keep the static method signature def runs(self,
readonly: bool = False) -> list["TaskRun"] and delete the later dynamic wiring
that reassigns/overwrites runs, or remove the static definition and ensure the
dynamic wiring provides the full implementation (including type hints and
docstring) so only one runs implementation remains; update references to
TaskRun.runs accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7b207c8d-a0de-4ef2-a996-ed571bb0b889

📥 Commits

Reviewing files that changed from the base of the PR and between 243da49 and 1accfe0.

📒 Files selected for processing (1)
  • libs/core/kiln_ai/datamodel/task_run.py

return schema_from_json_str(self.input_json_schema, require_object=False)

# These wrappers help for typechecking. We should fix this in KilnParentModel
def runs(self, readonly: bool = False) -> list[TaskRun]:
Copy link
Contributor

@chiang-daniel chiang-daniel Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the nested runs change the meaning of task.runs() here? Does this API need to traverse all the nested runs?

This scans only the runs under the task’s runs/ folder and skips the nested runs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not change the meaning - it works the same, finds the children of this task runs (not all the descendants). And there are utils to DFS to a specific TaskRun ID given the root TaskRun ID or the Task ID.

The point of nested TaskRun is to allow saving an entire snapshot of a conversation - where each turn produces a new snapshot of the conversation (the path to it is the path from the first message through all the descendant TaskRuns part of this conversation). Each TaskRun in the tree has the entire trace.


return [Task, TaskRun]

def runs(self, readonly: bool = False) -> list["TaskRun"]:
Copy link
Contributor

@chiang-daniel chiang-daniel Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future we will have a structure like task/runs/<run1>/runs/<run2>/runs/<run3>/task_run.kiln and some of the APIs still assumes there is only "1 level" of runs.

run_api.py:

  1. get_runs()
  2. get_runs_summary()...etc

dataset_split.py

  1. _get_runs()...etc

In general do we have to worry about those? What about usages in eval / fine-tune...etc or do we assume at those call paths we only care about "immediate children" and not all the decedents?

Copy link
Collaborator Author

@leonardmq leonardmq Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We most likely won't have the hierarchy surfacing in the API.

Each TaskRun is a full self-contained snapshot / version of the conversation at some point in time - it does not need to know its ancestry path to be used as it has the full trace. We will likely only ever need to find a TaskRun by its ID. For that we have the method to find_task_run_by_id_dfs.

Collecting all descendant runs of the whole subtree is trivial tree traversal if we ever need it. We can have that in a helper if we ever need to do that, but it does not need to replace the method for getting the children.

And this is only used for multiturn - we don't support a lot of downstream use cases for multiturn at the moment. Eval, finetune, UI, is up in the air. This is at the moment mostly for persistence in a multiturn scenario using Kiln as an SDK.

Once we know what the UI will look like for multiturn, we will probably do another pass to see where we want to surface the whole tree, where we want only the leaves or only the roots

return None

# the parent is a TaskRun, so we just walk up the tree until we find a Task
current = parent # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we avoid # type: ignore by checking directly with TaskRun class type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was using the class name for consistency with how we do it in the other datamodels, but isinstance works too - swapped them out here: e26e52d


def is_root_task_run(self) -> bool:
"""Is this the root task run? (not nested under another task run)"""
return self.parent is None or self.parent.__class__.__name__ == "Task"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use self.cached_parent() otherwise it might not be valid until load_parent is called?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.parent calls the get_attributes override (which overrides the dot syntax accessor) on the KilnBaseModel (the class this inherits from).

That override lazy loads the parent from disk if it is not cached

@leonardmq leonardmq merged commit f9af9d1 into leonard/kil-447-feat-stream-multiturn-ai-sdk-openai-protocols Mar 17, 2026
10 checks passed
@leonardmq leonardmq deleted the leonard/kil-461-feat-nesting-task-runs-into-each-other branch March 17, 2026 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants