
refactor: support multiturn chat#1088

Closed
leonardmq wants to merge 6 commits into main from leonard/kil-421-adapter-support-multiturn

Conversation

@leonardmq
Collaborator

@leonardmq leonardmq commented Feb 26, 2026

What does this PR do?

Issue: we currently only support single-turn chats (system -> user -> assistant, plus a couple of slight variants we implement internally, like two-message CoT).

This PR adds support for multiturn conversations, letting the conversation history accumulate. It currently works as follows:

  • create a run as usual; this produces a TaskRun that contains the trace, with output and intermediate_outputs extracted from the latest (assistant) message; no change there
  • to continue a conversation, call the same method but pass an existing TaskRun.id; we internally load the existing TaskRun.trace, create a new run (appending the new user message to the existing trace), and replace output and intermediate_outputs with the new latest message

This is fine for plain inference via the SDK, but it opens up downstream side effects in our app:

  • TaskRuns are no longer immutable; a single-turn run may become a multiturn run as the user adds new messages, and an existing multiturn run may get further messages appended, changing its latest message
  • Evals depend on TaskRuns, but if task runs change, evals can no longer be compared against previous runs; or at least we cannot guarantee that through the modeling
  • etc.

We will need to implement restrictions on the UI / user side to prevent users from causing downstream side effects we do not want.

Checklists

  • Tests have been run locally and passed
  • New tests have been added for any work in /lib

Summary by CodeRabbit

Release Notes

  • New Features

    • Session continuation: Tasks can be resumed from previous runs with full conversation history preserved, enabling users to extend multi-turn interactions and build upon prior exchanges without losing context.
  • Tests

    • Added comprehensive test coverage for session continuation and multi-turn conversation formatting.
  • Limitations

    • MCP adapter: Does not support session continuation and maintains single-turn behavior.

@gemini-code-assist
Contributor

Summary of Changes

Hello @leonardmq, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the AI system's conversational capabilities by enabling multi-turn interactions. It refactors the core logic to allow existing TaskRun instances to be updated with new messages, effectively maintaining conversation history. This change introduces a specialized formatter for managing the flow of multi-turn dialogues and ensures that certain adapters, like the MCP adapter, correctly handle their single-turn nature. The overall impact is a more dynamic and stateful conversational experience for users.

Highlights

  • Multi-turn Chat Support: Introduced the ability to continue existing chat sessions by passing a task_run_id, allowing new user messages to be appended to a run's trace.
  • Mutable TaskRun Objects: Modified TaskRun objects to be mutable, enabling updates to their output, trace, usage, and intermediate outputs when a session is continued.
  • MultiturnFormatter: Added a new MultiturnFormatter class to handle the formatting of messages for multi-turn conversations, seeding new turns with the prior conversation trace.
  • MCP Adapter Limitations: Explicitly marked the MCP (Model Context Protocol) adapter as not supporting multi-turn conversations, raising NotImplementedError if a task_run_id or prior_trace is provided.
  • API Schema Update: Updated the API schema to include an optional task_run_id field in the RunTaskRequest for initiating multi-turn interactions.


Changelog
  • app/web_ui/src/lib/api_schema.d.ts
    • Added task_run_id property to the components.schemas.RunTaskRequest interface, allowing clients to specify an existing run to continue a session.
  • libs/core/kiln_ai/adapters/chat/__init__.py
    • Imported MultiturnFormatter.
    • Exported MultiturnFormatter in __all__.
  • libs/core/kiln_ai/adapters/chat/chat_formatter.py
    • Imported ChatCompletionMessageParam type.
    • Added initial_messages abstract method to ChatFormatter base class.
    • Implemented MultiturnFormatter class to handle multi-turn conversation logic, initializing with a prior_trace and appending new user input.
  • libs/core/kiln_ai/adapters/chat/test_chat_formatter.py
    • Imported MultiturnFormatter for testing.
    • Added test_multiturn_formatter_initial_messages to verify initial message seeding.
    • Added test_multiturn_formatter_next_turn to test the multi-turn message progression.
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
    • Imported MultiturnFormatter and ID_TYPE.
    • Modified invoke and _run_returning_run_output methods to accept an optional task_run_id parameter.
    • Added logic to load an existing TaskRun based on task_run_id and retrieve its trace.
    • Updated _run abstract method signature to include an optional prior_trace parameter.
    • Modified build_chat_formatter to return a MultiturnFormatter if prior_trace is provided.
    • Refactored generate_run to accept an optional existing_run and merge new output, usage, and intermediate outputs into it if present, otherwise creating a new TaskRun.
  • libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py
    • Updated _run method signature to accept an optional prior_trace parameter.
    • Modified _run to use chat_formatter.initial_messages() to seed the conversation, which will include the prior_trace if provided.
  • libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py
    • Updated _run, invoke, and invoke_returning_run_output method signatures to accept optional prior_trace or task_run_id parameters.
    • Added NotImplementedError checks in _run, invoke, and invoke_returning_run_output to prevent multi-turn conversations with the MCP adapter.
  • libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
    • Updated _run method signature in MockAdapter to accept prior_trace.
    • Updated mock_run function signatures in various tests to accept prior_trace.
    • Added test_build_chat_formatter_with_prior_trace_returns_multiturn_formatter to verify correct formatter selection.
    • Added test_task_run_id_task_path_none_raises to ensure error handling for missing task path.
    • Added test_invoke_returning_run_output_passes_task_run_id_to_run to confirm task_run_id is passed correctly.
  • libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py
    • Imported ModelTurnResult for testing.
    • Added test_run_with_prior_trace_uses_multiturn_formatter to verify multi-turn behavior in LiteLLM adapter.
  • libs/core/kiln_ai/adapters/model_adapters/test_mcp_adapter.py
    • Added test_mcp_adapter_rejects_multiturn_invoke_returning_run_output to confirm multi-turn rejection.
    • Added test_mcp_adapter_rejects_multiturn_invoke to confirm multi-turn rejection.
    • Added test_mcp_adapter_rejects_prior_trace_in_run to confirm multi-turn rejection.
  • libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
    • Imported MagicMock for testing.
    • Updated _run method signature in MockAdapter to accept prior_trace.
    • Added test_invoke_continue_session to verify session continuation and run updates.
    • Added test_invoke_continue_invalid_task_run_id for invalid run ID handling.
    • Added test_invoke_continue_run_without_trace for runs missing trace history.
    • Added test_generate_run_with_existing_run_merges_usage_and_intermediate_outputs to test data merging.
  • libs/core/kiln_ai/adapters/model_adapters/test_structured_output.py
    • Updated _run method signature in MockAdapter to accept prior_trace.
  • libs/core/kiln_ai/adapters/test_prompt_builders.py
    • Updated _run method signature in MockAdapter to accept prior_trace.
  • libs/core/kiln_ai/datamodel/test_basemodel.py
    • Updated _run method signature in MockAdapter to accept prior_trace.
  • libs/server/kiln_server/run_api.py
    • Added task_run_id field to RunTaskRequest Pydantic model.
    • Modified run_task endpoint to pass request.task_run_id to adapter.invoke.
  • libs/server/kiln_server/test_run_api.py
    • Imported os, Path, KilnBuiltInToolId, and Config for new tests and fixtures.
    • Added test_run_task_with_task_run_id_continues_session to verify API endpoint passes task_run_id for session continuation.
    • Added adapter_sanity_check_setup and adapter_sanity_check_math_tools_setup fixtures for integration testing.
    • Added _assert_math_tools_response helper function for math tool test assertions.
    • Added test_run_task_adapter_sanity_math_tools for multi-turn math tool interaction with gpt_4o_mini.
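The generate_run merge described in the changelog can be pictured roughly as follows (dict-based stand-ins; the real TaskRun is a Pydantic model, and the exact merge rules may differ):

```python
# Rough stand-in for the generate_run merge path described above; the real
# TaskRun is a Pydantic model with different fields, so this is illustrative.
def generate_run(new_output, new_trace, new_usage, new_intermediates, existing_run=None):
    if existing_run is None:
        # Fresh session: create a new run.
        return {
            "output": new_output,
            "trace": new_trace,
            "usage": dict(new_usage),
            "intermediate_outputs": dict(new_intermediates),
        }
    # Continuing a session: replace output/trace, merge usage and intermediates.
    existing_run["output"] = new_output
    existing_run["trace"] = new_trace
    for key, value in new_usage.items():
        existing_run["usage"][key] = existing_run["usage"].get(key, 0) + value
    existing_run["intermediate_outputs"].update(new_intermediates)
    return existing_run

first = generate_run("4", ["..."], {"total_tokens": 10}, {"cot": "a"})
second = generate_run("12", ["...", "..."], {"total_tokens": 7}, {"cot": "b"}, existing_run=first)
```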
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Contributor

coderabbitai bot commented Feb 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This PR implements multi-turn conversation continuation by introducing a MultiturnFormatter class that resumes from prior message traces. The base adapter now accepts existing_run parameters to retrieve prior conversation history, passing it through the chat formatting pipeline, while model-specific adapters (LiteLLM supports continuation; MCP rejects it) handle the resumed trace appropriately.

Changes

Cohort / File(s) Summary
Chat Formatter Core
libs/core/kiln_ai/adapters/chat/__init__.py, libs/core/kiln_ai/adapters/chat/chat_formatter.py
Added ChatCompletionMessageIncludingLiteLLM type alias and new MultiturnFormatter class that accepts prior traces and resumes conversation from previous messages. Added initial_messages() method to base ChatFormatter class. Exported both new entities as public API.
BaseAdapter Enhancement
libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
Extended invoke, _run_returning_run_output, and invoke_returning_run_output with optional existing_run parameter. Added prior_trace parameter to _run and build_chat_formatter. Modified generate_run to merge outputs and usage when continuing existing runs, extracting prior trace from existing run's message history.
LiteLLMAdapter Continuation Support
libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py
Added prior_trace parameter to _run method. Changed initial messages initialization to use chat_formatter.initial_messages() instead of empty list, enabling continuation of prior traces through the same model turn loop.
MCPAdapter Single-Turn Enforcement
libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py
Added prior_trace and existing_run parameters to _run, invoke, and invoke_returning_run_output methods; raises NotImplementedError when these are provided to enforce single-turn-only behavior.
Formatter Tests
libs/core/kiln_ai/adapters/chat/test_chat_formatter.py
Added tests for MultiturnFormatter: validates initial_messages returns prior trace and next_turn handles first turn with final call and returns None on subsequent turns.
BaseAdapter Tests
libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
Extended MockAdapter._run signature with prior_trace parameter. Added tests validating MultiturnFormatter creation with prior trace, continuation error handling (missing trace), and prior trace threading through run invocations.
LiteLLMAdapter & MCP Tests
libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_mcp_adapter.py
Added test for multi-turn formatter usage with prior trace in LiteLLM. Added tests validating MCP adapter rejects continuation attempts via NotImplementedError.
Session Continuation Tests
libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
Added tests for session continuation: validating trace updates, output merging, run reference preservation, and error handling when continuing without trace history.
Remaining Model Tests
libs/core/kiln_ai/adapters/model_adapters/test_structured_output.py, libs/core/kiln_ai/adapters/test_prompt_builders.py, libs/core/kiln_ai/datamodel/test_basemodel.py
Updated MockAdapter._run signature across test files to accept optional prior_trace parameter for consistency with new base adapter protocol.
Server Integration Test
libs/server/kiln_server/test_run_api.py
Added sanity test for math tools via OpenRouter with trace extension support; includes helper fixtures and assertion utilities for end-to-end adapter validation.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant BaseAdapter
    participant ChatFormatter
    participant LiteLLMAdapter
    participant LLM

    User->>BaseAdapter: invoke(input, existing_run)
    Note over BaseAdapter: Detect prior_trace from<br/>existing_run.trace
    BaseAdapter->>ChatFormatter: build_chat_formatter(input, prior_trace)
    alt prior_trace provided
        ChatFormatter-->>ChatFormatter: Create MultiturnFormatter
    else no prior_trace
        ChatFormatter-->>ChatFormatter: Create standard Formatter
    end
    ChatFormatter-->>BaseAdapter: Formatter instance
    BaseAdapter->>BaseAdapter: _run_returning_run_output(input, prior_trace)
    BaseAdapter->>LiteLLMAdapter: _run(input, prior_trace)
    LiteLLMAdapter->>LiteLLMAdapter: messages = initial_messages()
    alt prior_trace provided
        LiteLLMAdapter->>LiteLLMAdapter: messages includes prior conversation
    end
    LiteLLMAdapter->>LiteLLMAdapter: next_turn() processes new input
    LiteLLMAdapter->>LLM: Call LLM with full message history
    LLM-->>LiteLLMAdapter: Response
    LiteLLMAdapter-->>BaseAdapter: RunOutput with merged trace
    BaseAdapter->>BaseAdapter: generate_run(merged_outputs)
    BaseAdapter-->>User: TaskRun (continued session)
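The alt branch in the diagram corresponds to a simple dispatch in build_chat_formatter; a sketch with stub formatter classes standing in for the real ChatFormatter subclasses:

```python
# Dispatch sketch for build_chat_formatter's alt branch; stub classes stand in
# for the real ChatFormatter subclasses, whose constructors differ.
class StandardFormatterStub:
    def __init__(self, user_input):
        self.user_input = user_input

    def initial_messages(self):
        return []  # a fresh conversation starts empty

class MultiturnFormatterStub:
    def __init__(self, prior_trace, user_input):
        self.prior_trace = list(prior_trace)
        self.user_input = user_input

    def initial_messages(self):
        return list(self.prior_trace)  # resume from the prior trace

def build_chat_formatter(user_input, prior_trace=None):
    if prior_trace is not None:
        return MultiturnFormatterStub(prior_trace, user_input)
    return StandardFormatterStub(user_input)
```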

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • scosman
  • sfierro

Poem

🐰 bounces with glee
Prior words are kept, no need to forget!
The traces live on, a conversation set,
Each turn remembers what came before,
MultiturnFormatter opens the door!
Hop through the context, message by message bright,
Where multi-turn chats reach their full flight! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 31.71%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check — ✅ Passed. The title 'refactor: support multiturn chat' accurately and concisely summarizes the main change: adding multiturn conversation support to the chat system.
  • Description check — ✅ Passed. The PR description provides a comprehensive explanation of what the PR does, related issue context, and completed checklists matching the template requirements.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (1)

131-136: Type inconsistency: task_run_id parameter type.

In invoke() the parameter is typed as ID_TYPE | None, but in mcp_adapter.py (lines 88, 105) it's typed as str | None. For consistency and to leverage the type alias, consider using ID_TYPE throughout or verifying ID_TYPE is indeed str.

#!/bin/bash
# Verify ID_TYPE definition and usage consistency
ast-grep --pattern 'ID_TYPE = $_'
rg -n "task_run_id:" --type=py | head -20
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/adapters/model_adapters/base_adapter.py` around lines 131 -
136, The task_run_id parameter type is inconsistent between base_adapter.invoke
(uses ID_TYPE | None) and mcp_adapter (uses str | None); update the code to use
the same type alias everywhere by replacing explicit str | None with ID_TYPE |
None (or alternatively ensure the ID_TYPE alias is defined as str and update its
definition) so invoke(), invoke_returning_run_output, and the mcp adapter
methods share a consistent type for task_run_id; locate symbols invoke,
invoke_returning_run_output, and the task_run_id annotations in mcp_adapter to
apply the change.
libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py (1)

88-88: Consider using ID_TYPE for consistency with BaseAdapter.

The task_run_id parameter is typed as str | None here but as ID_TYPE | None in base_adapter.py. For type consistency across the codebase, consider aligning with the type alias.

Also applies to: 105-105

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py` at line 88, Replace
the explicit str | None typing for the task_run_id parameter with the shared
alias ID_TYPE | None to match BaseAdapter; update each function/method in
mcp_adapter.py where the parameter task_run_id appears (both occurrences
flagged) so their signatures use ID_TYPE | None and adjust any related imports
to import ID_TYPE from the module where it is defined.
libs/server/kiln_server/test_run_api.py (1)

1725-1732: Unused helper functions — consider removing.

_adapter_sanity_check_output_path() and _append_to_sanity_check() are defined but not called anywhere in this file. If these are leftover from development, they should be removed to avoid confusion.

🗑️ Remove unused code
-def _adapter_sanity_check_output_path() -> Path:
-    return Path(__file__).resolve().parent / "adapter_sanity_check.txt"
-
-
-def _append_to_sanity_check(content: str, output_path: Path) -> None:
-    with open(output_path, "a", encoding="utf-8") as f:
-        f.write(content)
-        f.write("\n")
-
-
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1725 - 1732, Remove the
two unused helper functions _adapter_sanity_check_output_path() and
_append_to_sanity_check() from the file: delete their definitions (the functions
named _adapter_sanity_check_output_path and _append_to_sanity_check) since they
are not referenced anywhere in the test_run_api.py module to avoid dead code and
confusion; if they are intended for future use, alternatively add a clear TODO
comment and a test that exercises them, but otherwise simply remove both
function definitions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py`:
- Around line 433-451: The test test_task_run_id_task_path_none_raises fails
because model_provider() is invoked during MockAdapter initialization and raises
an API key error; update the test to mock/stub model_provider (or the OpenAI
provider init) so it doesn't perform real API key validation before exercising
the path check. Specifically, in the test_task_run_id_task_path_none_raises
where MockAdapter is constructed with KilnAgentRunConfigProperties and
adapter.invoke is awaited, patch the model_provider function (or the
class/method used by MockAdapter to obtain the provider) to return a benign/mock
provider object so the ValueError from adapter.invoke("input",
task_run_id="some-id") (about task.path) is the raised exception as expected.

In `@libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py`:
- Around line 1309-1353: The test
test_run_with_prior_trace_uses_multiturn_formatter calls adapter._run, which
internally calls model_provider() before _run_model_turn, causing an API key
lookup; update the test to stub or mock the adapter.model_provider method (or
attribute) to return a simple dummy provider object (e.g., a dict or a small
stub class) so no external key lookup happens, then set
adapter._run_model_turn as already done; ensure the mock is assigned on the same
adapter instance created from LiteLlmAdapter so build_chat_formatter and _run
use the stubbed provider.

In `@libs/server/kiln_server/test_run_api.py`:
- Around line 1735-1778: The adapter_sanity_check_setup fixture contains a
hardcoded user path, an unused tmp_path parameter, and appears unused; either
delete the adapter_sanity_check_setup fixture entirely (and remove its tmp_path
parameter) or refactor it to mirror adapter_sanity_check_math_tools_setup by
using tmp_path for project_path (e.g., project_path = tmp_path /
"adapter_sanity_project" / "project.kiln"), create the directory and
Project/Task there via Project(name=..., path=str(project_path)) and Task(...,
parent=project), register the temporary project into
Config.shared()._settings["projects"], yield the {"project","task"} dict, and
restore original_projects after the yield; ensure you update references to
adapter_sanity_check_setup or remove any unused fixture imports.


ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 05f9d4d and baba621.

📒 Files selected for processing (16)
  • app/web_ui/src/lib/api_schema.d.ts
  • libs/core/kiln_ai/adapters/chat/__init__.py
  • libs/core/kiln_ai/adapters/chat/chat_formatter.py
  • libs/core/kiln_ai/adapters/chat/test_chat_formatter.py
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_mcp_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
  • libs/core/kiln_ai/adapters/model_adapters/test_structured_output.py
  • libs/core/kiln_ai/adapters/test_prompt_builders.py
  • libs/core/kiln_ai/datamodel/test_basemodel.py
  • libs/server/kiln_server/run_api.py
  • libs/server/kiln_server/test_run_api.py

Comment on lines +433 to +451
@pytest.mark.asyncio
async def test_task_run_id_task_path_none_raises(base_project):
    task = Task(
        name="test_task",
        instruction="test_instruction",
        parent=base_project,
    )
    assert task.path is None
    adapter = MockAdapter(
        task=task,
        run_config=KilnAgentRunConfigProperties(
            model_name="test_model",
            model_provider_name="openai",
            prompt_id="simple_prompt_builder",
            structured_output_mode="json_schema",
        ),
    )
    with pytest.raises(ValueError, match="task has no path"):
        await adapter.invoke("input", task_run_id="some-id")
Contributor


⚠️ Potential issue | 🔴 Critical

Pipeline failure: Test raises wrong exception due to missing mocks.

The test expects a ValueError with message "task has no path", but the model_provider() call happens first and raises an API key error because OpenAI provider initialization isn't mocked.

🐛 Proposed fix: Mock model_provider to avoid API key check
 @pytest.mark.asyncio
 async def test_task_run_id_task_path_none_raises(base_project):
     task = Task(
         name="test_task",
         instruction="test_instruction",
         parent=base_project,
     )
     assert task.path is None
     adapter = MockAdapter(
         task=task,
         run_config=KilnAgentRunConfigProperties(
             model_name="test_model",
             model_provider_name="openai",
             prompt_id="simple_prompt_builder",
             structured_output_mode="json_schema",
         ),
     )
+
+    # Mock model_provider to avoid API key validation
+    provider = MagicMock()
+    provider.parser = None
+    provider.formatter = None
+    provider.reasoning_capable = False
+    adapter.model_provider = MagicMock(return_value=provider)
+
     with pytest.raises(ValueError, match="task has no path"):
         await adapter.invoke("input", task_run_id="some-id")
🧰 Tools
🪛 GitHub Actions: Build and Test

[error] 450-451: Regex pattern 'task has no path' did not match. The test expected a ValueError with message 'task has no path' but OpenAI API key initialization raised a different error: 'Attempted to use OpenAI without an API key set. Get your API key from https://platform.openai.com/account/api-keys'.


Comment on lines +1309 to +1353
@pytest.mark.asyncio
async def test_run_with_prior_trace_uses_multiturn_formatter(config, mock_task):
    prior_trace = [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
    ]
    adapter = LiteLlmAdapter(config=config, kiln_task=mock_task)

    build_chat_formatter_calls = []

    original_build = adapter.build_chat_formatter

    def capturing_build(input, prior_trace_arg=None):
        build_chat_formatter_calls.append((input, prior_trace_arg))
        return original_build(input, prior_trace_arg)

    adapter.build_chat_formatter = capturing_build

    async def mock_run_model_turn(
        provider, prior_messages, top_logprobs, skip_response_format
    ):
        extended = list(prior_messages)
        extended.append({"role": "assistant", "content": "How can I help?"})
        return ModelTurnResult(
            assistant_message="How can I help?",
            all_messages=extended,
            model_response=None,
            model_choice=None,
            usage=Usage(),
        )

    adapter._run_model_turn = mock_run_model_turn

    run_output, _ = await adapter._run("follow-up", prior_trace=prior_trace)

    assert len(build_chat_formatter_calls) == 1
    assert build_chat_formatter_calls[0][0] == "follow-up"
    assert build_chat_formatter_calls[0][1] == prior_trace

    assert run_output.trace is not None
    assert len(run_output.trace) == 4
    assert run_output.trace[0]["content"] == "hi"
    assert run_output.trace[1]["content"] == "hello"
    assert run_output.trace[2]["content"] == "follow-up"
    assert run_output.trace[3]["content"] == "How can I help?"

⚠️ Potential issue | 🟡 Minor

Test requires additional mocking to avoid API key lookup.

The pipeline failure indicates model_provider() is called before _run_model_turn, triggering the OpenRouter API key check. The test should mock model_provider() to bypass this.

🐛 Proposed fix to mock model_provider
 @pytest.mark.asyncio
 async def test_run_with_prior_trace_uses_multiturn_formatter(config, mock_task):
     prior_trace = [
         {"role": "user", "content": "hi"},
         {"role": "assistant", "content": "hello"},
     ]
     adapter = LiteLlmAdapter(config=config, kiln_task=mock_task)
 
     build_chat_formatter_calls = []
 
     original_build = adapter.build_chat_formatter
 
     def capturing_build(input, prior_trace_arg=None):
         build_chat_formatter_calls.append((input, prior_trace_arg))
         return original_build(input, prior_trace_arg)
 
     adapter.build_chat_formatter = capturing_build
 
+    # Mock model_provider to avoid API key lookup
+    mock_provider = Mock()
+    mock_provider.model_id = "test-model"
+    adapter.model_provider = Mock(return_value=mock_provider)
+
     async def mock_run_model_turn(
         provider, prior_messages, top_logprobs, skip_response_format
     ):
🧰 Tools
🪛 GitHub Actions: Build and Test

[error] 1342-1342: ValueError: Attempted to use OpenRouter without an API key set. Get your API key from https://openrouter.ai/settings/keys

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py` around
lines 1309 - 1353, The test test_run_with_prior_trace_uses_multiturn_formatter
calls adapter._run which internally calls model_provider() before
_run_model_turn, causing an API key lookup; update the test to stub or mock the
adapter.model_provider method (or attribute) to return a simple dummy provider
object (e.g., a dict or a small stub class) so no external key lookup happens,
then proceed to set adapter._run_model_turn as already done; ensure the mock is
assigned on the same adapter instance created from LiteLlmAdapter so
build_chat_formatter and _run use the stubbed provider.

Comment on lines +1735 to +1778
@pytest.fixture
def adapter_sanity_check_setup(tmp_path):
    """Setup for paid adapter sanity check tests - real project/task, no adapter mocking."""
    # if project at the path does not exist, create it, otherwise reuse
    project_path = (
        Path("/Users/leonardmarcq/Downloads/")
        / "adapter_sanity_project"
        / "project.kiln"
    )
    if not project_path.exists():
        project_path.parent.mkdir()

        project = Project(name="Adapter Sanity Project", path=str(project_path))
        project.save_to_file()

        task = Task(
            name="Adapter Sanity Task",
            instruction="You are a helpful assistant. Respond concisely.",
            description="Task for adapter sanity checking",
            parent=project,
        )
        task.save_to_file()

    else:
        project = Project.load_from_file(project_path)
        task = next(
            (
                t
                for t in project.tasks(readonly=True)
                if t.name == "Adapter Sanity Task"
            ),
            None,
        )
        if task is None:
            raise ValueError("Task not found")

    config = Config.shared()
    original_projects = list(config.projects) if config.projects else []
    config._settings["projects"] = [*original_projects, str(project.path)]

    yield {"project": project, "task": task}

    config._settings["projects"] = original_projects


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Search for usages of adapter_sanity_check_setup fixture
rg -n "adapter_sanity_check_setup" --type py

Repository: Kiln-AI/Kiln

Length of output: 142


🏁 Script executed:

# Also check the structure of the test file to understand fixture usage
rg -n "def test_" libs/server/kiln_server/test_run_api.py | head -20

Repository: Kiln-AI/Kiln

Length of output: 1130


🏁 Script executed:

# Look for the math tools fixture mentioned in the comment for comparison
rg -n "adapter_sanity_check_math_tools_setup" --type py

Repository: Kiln-AI/Kiln

Length of output: 458


🏁 Script executed:

# Get the math_tools fixture implementation for comparison
sed -n '1781,1840p' libs/server/kiln_server/test_run_api.py

Repository: Kiln-AI/Kiln

Length of output: 2403


🏁 Script executed:

# Get the test that uses the math_tools fixture to see the pattern
sed -n '1839,1860p' libs/server/kiln_server/test_run_api.py

Repository: Kiln-AI/Kiln

Length of output: 934


Remove unused fixture or refactor to use tmp_path instead of hardcoded path.

This fixture has three critical issues:

  1. Hardcoded user-specific path: /Users/leonardmarcq/Downloads/adapter_sanity_project/project.kiln will fail on CI and on other developers' machines.

  2. Fixture is unused — No test in this file calls adapter_sanity_check_setup. The similar math tools fixture (adapter_sanity_check_math_tools_setup) is actively used and properly implements the pattern with tmp_path.

  3. Unused parameter — The tmp_path fixture parameter is accepted but never referenced.

Either remove this fixture or refactor it to use tmp_path like the working adapter_sanity_check_math_tools_setup fixture does.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1735 - 1778, The
adapter_sanity_check_setup fixture contains a hardcoded user path, an unused
tmp_path parameter, and appears unused; either delete the
adapter_sanity_check_setup fixture entirely (and remove its tmp_path parameter)
or refactor it to mirror adapter_sanity_check_math_tools_setup by using tmp_path
for project_path (e.g., project_path = tmp_path / "adapter_sanity_project" /
"project.kiln"), create the directory and Project/Task there via
Project(name=..., path=str(project_path)) and Task(..., parent=project),
register the temporary project into Config.shared()._settings["projects"], yield
the {"project","task"} dict, and restore original_projects after the yield;
ensure you update references to adapter_sanity_check_setup or remove any unused
fixture imports.
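The tmp_path refactor this prompt describes amounts to the following path handling. `Project` and `Task` are Kiln classes, so this sketch stubs the file write with tempfile to keep the path logic runnable anywhere.

```python
# Portable path handling for the fixture: derive the project location
# from a temporary directory instead of a hardcoded home path. In the
# real fixture, pytest's tmp_path fixture supplies the base directory
# and Project.save_to_file() performs the write simulated here.
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)  # what pytest would inject as `tmp_path`
    project_path = tmp_path / "adapter_sanity_project" / "project.kiln"
    project_path.parent.mkdir(parents=True, exist_ok=True)
    project_path.write_text("{}")  # stand-in for Project.save_to_file()
    created = project_path.exists()

print(created, project_path.name)
```

Because pytest creates a fresh tmp_path per test, the "reuse if it exists" branch of the original fixture becomes unnecessary.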

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and well-implemented feature to support multi-turn conversations. The core logic change involves passing a task_run_id to continue an existing chat session, which then mutates the TaskRun to append the new conversation turn. The changes are extensive, touching core adapters, API schemas, and tests. The implementation is robust, with good error handling for unsupported adapters and comprehensive test coverage for the new functionality. I've identified a high-severity issue in a test file related to a hardcoded path that needs to be addressed to ensure portability. Additionally, I've suggested a medium-severity improvement for type hinting to enhance code quality and maintainability. Overall, this is a solid contribution.

Comment on lines +1739 to +1743
    project_path = (
        Path("/Users/leonardmarcq/Downloads/")
        / "adapter_sanity_project"
        / "project.kiln"
    )

high

This test fixture uses a hardcoded local path (/Users/leonardmarcq/Downloads/), which will cause the test to fail on any other developer's machine or in a CI environment. Please use the tmp_path fixture provided by pytest to create temporary directories for tests, ensuring they are portable and isolated.

    project_path = (
        tmp_path
        / "adapter_sanity_project"
        / "project.kiln"
    )

Comment on lines +96 to +99
def initial_messages(self) -> list[Any]:
    """Messages to seed the conversation. Empty for fresh runs; prior trace for continuation."""
    # TODO: fix the type somehow
    return []

medium

The type hint for initial_messages can be more specific. Instead of list[Any], it can be list[ChatCompletionMessageParam], which is already imported in this file. This improves type safety and allows removing the TODO.

Suggested change
def initial_messages(self) -> list[Any]:
    """Messages to seed the conversation. Empty for fresh runs; prior trace for continuation."""
    # TODO: fix the type somehow
    return []
def initial_messages(self) -> list[ChatCompletionMessageParam]:
    """Messages to seed the conversation. Empty for fresh runs; prior trace for continuation."""
    return []

Comment on lines +267 to +271
def initial_messages(self) -> list[Any]:
    """Messages to seed the conversation (prior trace)."""
    # TODO: use the type we need, but trace is untyped, and we cannot import from litellm adapter here
    # or we get circular imports
    return list(self._prior_trace)

medium

The type hint for this overridden method should also be updated from list[Any] to list[ChatCompletionMessageParam] to improve type safety. The _prior_trace attribute is already typed as list[ChatCompletionMessageParam]. The comment about circular imports seems to be outdated or incorrect, as ChatCompletionMessageParam is available from kiln_ai.utils.open_ai_types.

Suggested change
def initial_messages(self) -> list[Any]:
    """Messages to seed the conversation (prior trace)."""
    # TODO: use the type we need, but trace is untyped, and we cannot import from litellm adapter here
    # or we get circular imports
    return list(self._prior_trace)
def initial_messages(self) -> list[ChatCompletionMessageParam]:
    """Messages to seed the conversation (prior trace)."""
    return list(self._prior_trace)

@leonardmq leonardmq force-pushed the leonard/kil-421-adapter-support-multiturn branch from ee13268 to 3f83187 Compare February 26, 2026 11:44
@leonardmq leonardmq marked this pull request as draft February 26, 2026 11:45
@github-actions
Copy link

github-actions bot commented Feb 26, 2026

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/main...HEAD

  • libs/core/kiln_ai/adapters/chat/chat_formatter.py (92.0%): Missing lines 288,293
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (100%)
  • libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py (100%)
  • libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py (87.5%): Missing lines 96

Summary

  • Total: 64 lines
  • Missing: 3 lines
  • Coverage: 95%

Line-by-line

View line-by-line diff coverage

libs/core/kiln_ai/adapters/chat/chat_formatter.py

Lines 284-297

  284             return ChatTurn(messages=[user_msg], final_call=True)
  285 
  286         if self._state == "awaiting_final":
  287             if previous_output is None:
! 288                 raise ValueError("previous_output required for final step")
  289             self._messages.append(BasicChatMessage("assistant", previous_output))
  290             self._state = "done"
  291             return None
  292 
! 293         return None
  294 
  295 
  296 def get_chat_formatter(
  297     strategy: ChatStrategy,

libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py

Lines 92-100

   92                 "Session continuation is not supported for MCP adapter. "
   93                 "MCP tools are single-turn and do not maintain conversation state."
   94             )
   95 
!  96         run_output, _ = await self.invoke_returning_run_output(
   97             input, input_source, existing_run
   98         )
   99         return run_output


@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (1)

253-261: ⚠️ Potential issue | 🟠 Major

Continuation can return a non-persisted updated run with a persisted ID.

When existing_run is used, Line 259 skips ID clearing, but the save is skipped if autosave/allow_saving conditions fail. That returns a mutated run object that looks persisted while disk state remains stale. Next continuation can reload old trace/output and lose conversational state.

💡 Suggested fix
-        if (
-            self.base_adapter_config.allow_saving
-            and Config.shared().autosave_runs
-            and self.task.path is not None
-        ):
+        if existing_run is not None and (
+            not self.base_adapter_config.allow_saving
+            or not Config.shared().autosave_runs
+            or self.task.path is None
+        ):
+            raise ValueError(
+                "Session continuation requires persistence (allow_saving, autosave_runs, and task.path)."
+            )
+
+        if (
+            self.base_adapter_config.allow_saving
+            and Config.shared().autosave_runs
+            and self.task.path is not None
+        ):
             run.save_to_file()
         elif existing_run is None:
             # Clear the ID to indicate it's not persisted
             run.id = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/adapters/model_adapters/base_adapter.py` around lines 253 -
261, The current continuation logic can return a run with a persisted-looking ID
when save was skipped; update the block around base_adapter_config.allow_saving
/ Config.shared().autosave_runs / self.task.path so that whenever
run.save_to_file() is NOT called you clear run.id (e.g., set run.id = None)
regardless of existing_run, and only preserve the ID when save_to_file()
actually succeeded; adjust the branch around run.save_to_file() and the
existing_run check to ensure run.id reflects actual persistence state.
♻️ Duplicate comments (1)
libs/server/kiln_server/test_run_api.py (1)

1736-1778: ⚠️ Potential issue | 🟠 Major

Remove the machine-specific fixture path (and unused setup if not referenced).

Line 1740 hardcodes /Users/leonardmarcq/Downloads/..., which is non-portable and will fail outside one machine. Also, tmp_path is unused in this fixture, and this setup appears unused in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1736 - 1778, The
fixture adapter_sanity_check_setup hardcodes a machine-specific path in
project_path and never uses the tmp_path param; replace the hardcoded
Path("/Users/leonardmarcq/Downloads/…") with a portable path using the provided
tmp_path (e.g., tmp_path / "adapter_sanity_project" / "project.kiln") or remove
the fixture entirely if unused; update references inside
adapter_sanity_check_setup (project_path, project.save_to_file(),
Project.load_from_file(), and config._settings modification) to operate on the
tmp_path-based location, and remove the unused tmp_path parameter only if you
delete the fixture after confirming it is not referenced elsewhere.
🧹 Nitpick comments (2)
app/web_ui/src/lib/api_schema.d.ts (1)

6235-6239: Clarify run mutability in the task_run_id API description.

Line 6237 explains trace appending, but it does not explicitly state that continuing a session can mutate the referenced TaskRun’s latest output/intermediate_outputs. Please expand the backend OpenAPI field description to include this side effect, then regenerate this file.

Based on learnings: app/web_ui/src/lib/api_schema.d.ts is generated by openapi-typescript; do not propose manual edits. Schema changes should be made in the FastAPI backend (e.g., app/desktop/studio_server/data_gen_api.py or libs/server/kiln_server/*), then re-generate the TS types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/web_ui/src/lib/api_schema.d.ts` around lines 6235 - 6239, The OpenAPI
description for the query/body field task_run_id must explicitly state that
continuing a session will append to and may mutate the referenced TaskRun’s
latest output and intermediate_outputs; update the FastAPI schema/Field
description that defines task_run_id (look for the parameter/Field named
"task_run_id" used by the endpoint in data_gen_api.py or the Pydantic model
representing TaskRun in the kiln_server modules) to include a sentence like
“Continuing a session will append to the run’s trace and may update the
TaskRun’s latest output and intermediate_outputs,” then re-generate the
TypeScript types (openapi-typescript) so app/web_ui/src/lib/api_schema.d.ts is
updated.
libs/core/kiln_ai/adapters/chat/chat_formatter.py (1)

96-99: Resolve the initial_messages TODO typing debt before merge.

Line 98 and Line 269 leave Any + TODOs in the core formatter interface. This weakens type safety across the new multiturn path and keeps pipeline warnings unresolved.

I can draft a small follow-up patch that introduces a shared concrete message-seed type alias and removes both TODOs.

Also applies to: 267-271

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/adapters/chat/chat_formatter.py` around lines 96 - 99,
Introduce a concrete seed-message type and use it instead of Any: define a
shared type alias/TypedDict (e.g., MessageSeed or ChatSeedMessage with required
fields like role: str, content: str and optional metadata: dict[str, Any]) and
replace occurrences of list[Any] in initial_messages with list[MessageSeed];
update the signature of the initial_messages method in chat_formatter.py (and
the matching spot around lines 267-271) to return list[MessageSeed], remove the
TODO comments, and update any imports/annotations to reference the new alias so
the multiturn path uses the concrete message-seed type.
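One way to realize the alias this prompt suggests is a TypedDict, shown here as a sketch; the field names follow the review comment, and the real codebase may prefer its existing ChatCompletionMessageParam type instead.

```python
# Sketch of the suggested MessageSeed alias: a TypedDict replacing
# list[Any] in initial_messages. Field names follow the review comment;
# the optional metadata field lives in a total=False subclass so it can
# be omitted.
from typing import Any, Dict, List, TypedDict


class _MessageSeedRequired(TypedDict):
    role: str
    content: str


class MessageSeed(_MessageSeedRequired, total=False):
    metadata: Dict[str, Any]


def initial_messages(prior_trace: List[MessageSeed]) -> List[MessageSeed]:
    """Messages to seed the conversation (prior trace)."""
    return list(prior_trace)


seed = initial_messages([{"role": "user", "content": "hi"}])
print(seed[0]["role"], seed[0]["content"])
```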
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@libs/core/kiln_ai/adapters/model_adapters/base_adapter.py`:
- Around line 253-261: The current continuation logic can return a run with a
persisted-looking ID when save was skipped; update the block around
base_adapter_config.allow_saving / Config.shared().autosave_runs /
self.task.path so that whenever run.save_to_file() is NOT called you clear
run.id (e.g., set run.id = None) regardless of existing_run, and only preserve
the ID when save_to_file() actually succeeded; adjust the branch around
run.save_to_file() and the existing_run check to ensure run.id reflects actual
persistence state.

---

Duplicate comments:
In `@libs/server/kiln_server/test_run_api.py`:
- Around line 1736-1778: The fixture adapter_sanity_check_setup hardcodes a
machine-specific path in project_path and never uses the tmp_path param; replace
the hardcoded Path("/Users/leonardmarcq/Downloads/…") with a portable path using
the provided tmp_path (e.g., tmp_path / "adapter_sanity_project" /
"project.kiln") or remove the fixture entirely if unused; update references
inside adapter_sanity_check_setup (project_path, project.save_to_file(),
Project.load_from_file(), and config._settings modification) to operate on the
tmp_path-based location, and remove the unused tmp_path parameter only if you
delete the fixture after confirming it is not referenced elsewhere.

---

Nitpick comments:
In `@app/web_ui/src/lib/api_schema.d.ts`:
- Around line 6235-6239: The OpenAPI description for the query/body field
task_run_id must explicitly state that continuing a session will append to and
may mutate the referenced TaskRun’s latest output and intermediate_outputs;
update the FastAPI schema/Field description that defines task_run_id (look for
the parameter/Field named "task_run_id" used by the endpoint in data_gen_api.py
or the Pydantic model representing TaskRun in the kiln_server modules) to
include a sentence like “Continuing a session will append to the run’s trace and
may update the TaskRun’s latest output and intermediate_outputs,” then
re-generate the TypeScript types (openapi-typescript) so
app/web_ui/src/lib/api_schema.d.ts is updated.

In `@libs/core/kiln_ai/adapters/chat/chat_formatter.py`:
- Around line 96-99: Introduce a concrete seed-message type and use it instead
of Any: define a shared type alias/TypedDict (e.g., MessageSeed or
ChatSeedMessage with required fields like role: str, content: str and optional
metadata: dict[str, Any]) and replace occurrences of list[Any] in
initial_messages with list[MessageSeed]; update the signature of the
initial_messages method in chat_formatter.py (and the matching spot around lines
267-271) to return list[MessageSeed], remove the TODO comments, and update any
imports/annotations to reference the new alias so the multiturn path uses the
concrete message-seed type.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ee13268 and 3f83187.

📒 Files selected for processing (16)
  • app/web_ui/src/lib/api_schema.d.ts
  • libs/core/kiln_ai/adapters/chat/__init__.py
  • libs/core/kiln_ai/adapters/chat/chat_formatter.py
  • libs/core/kiln_ai/adapters/chat/test_chat_formatter.py
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_mcp_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
  • libs/core/kiln_ai/adapters/model_adapters/test_structured_output.py
  • libs/core/kiln_ai/adapters/test_prompt_builders.py
  • libs/core/kiln_ai/datamodel/test_basemodel.py
  • libs/server/kiln_server/run_api.py
  • libs/server/kiln_server/test_run_api.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • libs/core/kiln_ai/adapters/model_adapters/test_structured_output.py
  • libs/core/kiln_ai/adapters/chat/__init__.py
  • libs/server/kiln_server/run_api.py
  • libs/core/kiln_ai/adapters/chat/test_chat_formatter.py

@leonardmq leonardmq marked this pull request as ready for review February 27, 2026 10:40
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
libs/server/kiln_server/test_run_api.py (1)

1816-1826: ⚠️ Potential issue | 🟠 Major

Remove the hardcoded local path from the sanity fixture.

Line 1820 hardcodes /Users/leonardmarcq/..., which is non-portable, and this fixture still takes tmp_path without using it. This will fail outside one machine setup and adds brittle test infra.

Also applies to: 1838-1850

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1816 - 1826, The
fixture adapter_sanity_check_setup contains a hardcoded absolute path
(/Users/...) and ignores the tmp_path fixture; replace that hardcoded Path usage
with a test-local path derived from the tmp_path fixture (e.g., tmp_path /
"adapter_sanity_project" / "project.kiln"), ensure parent directories are
created and the project file is initialized as before, and do the same change
for the duplicate block around the adapter_sanity_check_setup sibling (the
second sanity fixture in the file); update any references that expect
project_path so they use the new tmp_path-based Path.
🧹 Nitpick comments (1)
libs/server/kiln_server/test_run_api.py (1)

1990-1993: Prefer normalized/structured assertion over exact LLM string equality.

Line 1993 asserts exact text ("[4, 12, 59]"), which is fragile for provider formatting variance (spacing/newlines). Parse or normalize before asserting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1990 - 1993, The test
is brittle because it asserts exact LLM output string equality; change the
assertion to parse/normalize the output before comparing: take
res4["output"]["output"] (where response4/res4 and task_run_id are used) and
convert it into a structured Python list (e.g., via json.loads or
ast.literal_eval after stripping whitespace/newlines) and then assert that the
resulting list equals [4, 12, 59] and that res4["id"] == task_run_id remains.
This ensures formatting differences (spaces/newlines) won't break the test.
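The normalization this prompt describes can be as small as the helper below; `parse_output` is an illustrative name, not a function from the codebase.

```python
# Normalizing the model output before asserting, as the prompt suggests:
# parse the string into a list so whitespace and formatting variance
# between providers cannot fail the test. ast.literal_eval also accepts
# Python-style literals that strict JSON parsing would reject.
import ast


def parse_output(raw: str) -> list:
    return list(ast.literal_eval(raw.strip()))


# Several plausible provider formattings normalize to the same value.
for raw in ["[4, 12, 59]", "[4,12,59]", "\n[ 4, 12, 59 ]\n"]:
    assert parse_output(raw) == [4, 12, 59]
print("ok")
```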
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/core/kiln_ai/adapters/model_adapters/base_adapter.py`:
- Around line 406-414: The current code merges previous and new intermediate
outputs (using merged_intermediate and existing_run.intermediate_outputs), which
can retain stale keys; instead, replace the merge with a direct assignment so
the latest assistant message wins: set existing_run.intermediate_outputs =
dict(run_output.intermediate_outputs or {}) (or None if you prefer explicit
clearing when there are no intermediate outputs) and remove the
merged_intermediate construction and usage so only
run_output.intermediate_outputs is stored.

In `@libs/server/kiln_server/run_api.py`:
- Around line 295-309: The continuation path reads and then mutates TaskRun (via
TaskRun.from_id_and_parent_path and later adapter.invoke using existing_run)
without synchronization, so concurrent continuations can clobber appended
trace/output; wrap the read/modify/write sequence with a guard (e.g., a per-run
async lock keyed by request.task_run_id or a DB transaction/row-level lock) to
serialize continuations for the same run before calling adapter.invoke, ensuring
only one coroutine can mutate existing_run at a time and releasing the lock
after the update completes or errors.
- Line 309: Wrap the call to adapter.invoke (the line returning await
adapter.invoke(input, existing_run=existing_run)) in a try/except that catches
NotImplementedError and re-raises a FastAPI HTTPException so clients receive a
4xx response; e.g. catch NotImplementedError as e and raise
HTTPException(status_code=400, detail=str(e)) (or use
status.HTTP_400_BAD_REQUEST) from e, and add the necessary import for
HTTPException (and status if used).

---

Duplicate comments:
In `@libs/server/kiln_server/test_run_api.py`:
- Around line 1816-1826: The fixture adapter_sanity_check_setup contains a
hardcoded absolute path (/Users/...) and ignores the tmp_path fixture; replace
that hardcoded Path usage with a test-local path derived from the tmp_path
fixture (e.g., tmp_path / "adapter_sanity_project" / "project.kiln"), ensure
parent directories are created and the project file is initialized as before,
and do the same change for the duplicate block around the
adapter_sanity_check_setup sibling (the second sanity fixture in the file);
update any references that expect project_path so they use the new
tmp_path-based Path.

---

Nitpick comments:
In `@libs/server/kiln_server/test_run_api.py`:
- Around line 1990-1993: The test is brittle because it asserts exact LLM output
string equality; change the assertion to parse/normalize the output before
comparing: take res4["output"]["output"] (where response4/res4 and task_run_id
are used) and convert it into a structured Python list (e.g., via json.loads or
ast.literal_eval after stripping whitespace/newlines) and then assert that the
resulting list equals [4, 12, 59] and that res4["id"] == task_run_id remains.
This ensures formatting differences (spaces/newlines) won't break the test.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f83187 and 04748e4.

📒 Files selected for processing (10)
  • libs/core/kiln_ai/adapters/chat/__init__.py
  • libs/core/kiln_ai/adapters/chat/chat_formatter.py
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/mcp_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_mcp_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
  • libs/server/kiln_server/run_api.py
  • libs/server/kiln_server/test_run_api.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • libs/core/kiln_ai/adapters/model_adapters/test_mcp_adapter.py

Comment on lines +295 to +309
existing_run = TaskRun.from_id_and_parent_path(
    request.task_run_id, task.path
)
if existing_run is None:
    raise HTTPException(
        status_code=404,
        detail="Run not found. Cannot continue session.",
    )
if not existing_run.trace or len(existing_run.trace) == 0:
    raise HTTPException(
        status_code=400,
        detail="Run has no trace. Cannot continue session without conversation history.",
    )

return await adapter.invoke(input, existing_run=existing_run)

⚠️ Potential issue | 🟠 Major

Guard continuation writes with a lock to prevent lost updates.

Line 31 already documents this load/update/write pattern as non-atomic, but Lines 295-309 perform continuation mutation without synchronization. Concurrent continuation calls for the same run can overwrite each other’s appended trace/output.

🔧 Suggested fix
-        return await adapter.invoke(input, existing_run=existing_run)
+        if existing_run is not None:
+            async with update_run_lock:
+                return await adapter.invoke(input, existing_run=existing_run)
+        return await adapter.invoke(input, existing_run=None)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/run_api.py` around lines 295 - 309, The continuation
path reads and then mutates TaskRun (via TaskRun.from_id_and_parent_path and
later adapter.invoke using existing_run) without synchronization, so concurrent
continuations can clobber appended trace/output; wrap the read/modify/write
sequence with a guard (e.g., a per-run async lock keyed by request.task_run_id
or a DB transaction/row-level lock) to serialize continuations for the same run
before calling adapter.invoke, ensuring only one coroutine can mutate
existing_run at a time and releasing the lock after the update completes or
errors.
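A per-run lock of the kind this prompt describes can be sketched as below; the endpoint wiring is omitted and the names are illustrative, only the locking pattern is the point.

```python
# Sketch of a per-run lock keyed by task_run_id: continuations of the
# same run are serialized, while different runs proceed concurrently.
# asyncio.sleep(0) stands in for awaiting adapter.invoke.
import asyncio
from collections import defaultdict

_run_locks: dict = defaultdict(asyncio.Lock)


async def continue_run(task_run_id: str, log: list) -> None:
    async with _run_locks[task_run_id]:
        # The load -> mutate -> save sequence runs atomically per run ID.
        log.append(f"start:{task_run_id}")
        await asyncio.sleep(0)  # stand-in for awaiting the model call
        log.append(f"end:{task_run_id}")


async def main() -> list:
    log: list = []
    await asyncio.gather(continue_run("r1", log), continue_run("r1", log))
    return log


log = asyncio.run(main())
print(log)  # the two continuations of "r1" never interleave
```

Note that a defaultdict of locks grows with distinct run IDs; a production version would evict idle entries, and this still only guards a single server process, not concurrent processes writing the same file.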

detail="Run has no trace. Cannot continue session without conversation history.",
)

return await adapter.invoke(input, existing_run=existing_run)

⚠️ Potential issue | 🟠 Major

Map unsupported continuation to a client-facing HTTP error.

On Line 309, adapter-level NotImplementedError (for example MCP continuation) can bubble as a server error. Convert it to an explicit HTTPException so clients get a clear 4xx response.

🔧 Suggested fix
-        return await adapter.invoke(input, existing_run=existing_run)
+        try:
+            return await adapter.invoke(input, existing_run=existing_run)
+        except NotImplementedError as e:
+            raise HTTPException(status_code=400, detail=str(e)) from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        return await adapter.invoke(input, existing_run=existing_run)
+        try:
+            return await adapter.invoke(input, existing_run=existing_run)
+        except NotImplementedError as e:
+            raise HTTPException(status_code=400, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/run_api.py` at line 309, Wrap the call to
adapter.invoke (the line returning await adapter.invoke(input,
existing_run=existing_run)) in a try/except that catches NotImplementedError and
re-raises a FastAPI HTTPException so clients receive a 4xx response; e.g. catch
NotImplementedError as e and raise HTTPException(status_code=400, detail=str(e))
(or use status.HTTP_400_BAD_REQUEST) from e, and add the necessary import for
HTTPException (and status if used).

@leonardmq leonardmq requested a review from scosman February 27, 2026 10:57
@leonardmq leonardmq mentioned this pull request Mar 3, 2026
2 tasks
@scosman
Collaborator

scosman commented Mar 5, 2026

TaskRun are no longer immutable

This makes sense. But it also breaks one of the core ideas behind the collaboration-with-git concept, where the files are small, mostly immutable, and unlikely to be edited by others, so unlikely to hit merge conflicts.

We could add a tree/fork concept where new runs are branched off a parent. That way, two people editing end up with two chains. This adds complexity when rendering the dataset list (we need to filter out items that have children and only show the leaf nodes), but seems manageable?
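
The leaf-node filtering mentioned above is cheap to do over a flat list. A minimal sketch, assuming each run is exposed as a dict with `id` and `parent_id` keys (a hypothetical shape, not the real TaskRun model):

```python
def leaf_runs(runs: list[dict]) -> list[dict]:
    # A run is a leaf iff no other run names it as its parent.
    parent_ids = {r["parent_id"] for r in runs if r["parent_id"] is not None}
    return [r for r in runs if r["id"] not in parent_ids]
```

Rendering only the leaves hides intermediate turns while still letting each chain be walked back via `parent_id`.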

As this PR is SDK only, and persisting is off by default in the SDK, I say leave it as is for now, and debate the above later when we add this to the app?


return await adapter.invoke(input)
existing_run: TaskRun | None = None
if request.task_run_id is not None:
maybe just remove the run_api support for now? Wait until we add it to app, and can design the API then?

Most SDK users won't have runs saved, so this will just error (off by default now).

P2: this works too, but we might want a different design later when doing the app, so it might just make sense to wait.

self,
input: InputType,
input_source: DataSource | None = None,
existing_run: TaskRun | None = None,
naming maybe one of: continue_from? prior_task_run? parent_task_run?

input, input_source, parsed_output, usage, run_output.trace
)
# Create the run and output - merge if there is an existing run
if existing_run is not None:
@scosman scosman Mar 5, 2026

see comment here: #1088 (comment)

I'm leaning more towards always saving a new task_run, and just setting a new parent_id field on the child? Keep it immutable.

Will add a bit of work in UI, but much more robust for collisions.
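
A minimal sketch of that parent_id approach, using a hypothetical frozen record in place of the real TaskRun model: continuation writes a new child instead of mutating the parent, so two people continuing the same run produce two sibling chains rather than a merge conflict.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRunRecord:
    # Hypothetical stand-in for TaskRun; the real model has many more fields.
    id: str
    parent_id: str | None
    trace: tuple[str, ...]


def continue_run(parent: TaskRunRecord, new_messages: list[str]) -> TaskRunRecord:
    # The parent is never mutated: continuation writes a fresh record whose
    # parent_id points back at it, keeping every saved run immutable.
    return TaskRunRecord(
        id=str(uuid.uuid4()),
        parent_id=parent.id,
        trace=parent.trace + tuple(new_messages),
    )
```

Evals that reference the parent run stay valid, since the record they point at never changes.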

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
libs/server/kiln_server/test_run_api.py (1)

1686-1729: ⚠️ Potential issue | 🟠 Major

Remove unused fixture with hardcoded user-specific path.

This fixture contains a hardcoded path (/Users/leonardmarcq/Downloads/...) that will fail on CI and other machines. The tmp_path parameter is accepted but never used. No test in this file references adapter_sanity_check_setup.

Either remove this fixture entirely, or refactor it to use tmp_path like adapter_sanity_check_math_tools_setup does.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1686 - 1729, The
fixture adapter_sanity_check_setup contains a hard-coded user path and never
uses the tmp_path param (and is unused by tests); remove this fixture entirely
or refactor it to use tmp_path: replace the hardcoded Path("/Users/...") with
tmp_path / "adapter_sanity_project" / "project.kiln", ensure Project/Task are
created/loaded from that temp path, update Config.shared() modifications to use
the temp project.path, and keep the teardown that restores
config._settings["projects"]; if no tests reference adapter_sanity_check_setup
simply delete the fixture to avoid CI breakage.
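
The tmp_path refactor the prompt suggests boils down to deriving every path from the fixture argument. A sketch with the Project/Task creation and Config changes elided (the helper name is hypothetical):

```python
from pathlib import Path


def build_sanity_project(tmp_path: Path) -> Path:
    # Everything lives under pytest's per-test tmp_path instead of a
    # hard-coded /Users/... directory, so the fixture works on CI and
    # leaves no files behind. The real fixture would also create the
    # Project/Task objects and point Config.shared() at project_file.
    project_file = tmp_path / "adapter_sanity_project" / "project.kiln"
    project_file.parent.mkdir(parents=True, exist_ok=True)
    project_file.write_text("{}", encoding="utf-8")
    return project_file
```

In a pytest fixture this would simply take `tmp_path` as a parameter; pytest handles cleanup automatically.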
🧹 Nitpick comments (2)
libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py (1)

13-17: Consider adding type hint for prior_trace parameter.

The base class defines prior_trace: list[ChatCompletionMessageParam] | None, but the mock omits the type. Adding it improves consistency.

♻️ Suggested fix
+from kiln_ai.utils.open_ai_types import ChatCompletionMessageParam
+
 class MockAdapter(BaseAdapter):
     async def _run(
         self,
         input: InputType,
-        prior_trace=None,
+        prior_trace: list[ChatCompletionMessageParam] | None = None,
     ) -> tuple[RunOutput, Usage | None]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py`
around lines 13 - 17, The mock async method _run is missing the explicit type
for the prior_trace parameter; update its signature to match the base class by
annotating prior_trace as list[ChatCompletionMessageParam] | None (or the
appropriate alias used in the codebase) so the mock matches the base typing for
_run and tools like static checkers; ensure any required imports or forward
references for ChatCompletionMessageParam are present.
libs/server/kiln_server/test_run_api.py (1)

1676-1684: Remove unused helper functions.

_adapter_sanity_check_output_path() and _append_to_sanity_check() are defined but never called. Additionally, _adapter_sanity_check_output_path() writes to the source directory rather than a temporary location, which would pollute the source tree.

Suggested removal
-def _adapter_sanity_check_output_path() -> Path:
-    return Path(__file__).resolve().parent / "adapter_sanity_check.txt"
-
-
-def _append_to_sanity_check(content: str, output_path: Path) -> None:
-    with open(output_path, "a", encoding="utf-8") as f:
-        f.write(content)
-        f.write("\n")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/server/kiln_server/test_run_api.py` around lines 1676 - 1684, Remove the
unused helper functions _adapter_sanity_check_output_path() and
_append_to_sanity_check() from the file: these helpers are never called and the
path helper writes into the source tree; delete both definitions and update any
tests that relied on them to instead use a temporary path (e.g., pytest
tmp_path) or test-local fixtures if output needs to be captured. Ensure there
are no remaining imports or references to _adapter_sanity_check_output_path or
_append_to_sanity_check in the module after removal.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py`:
- Around line 353-397: The test
test_generate_run_with_existing_run_merges_usage_and_intermediate_outputs
expects merging of intermediate_outputs but the PR changes specify that
intermediate_outputs should be taken from the latest assistant message (i.e.,
replaced, not merged); update the assertions in this test (which calls
adapter.generate_run with existing_run=initial_run and
RunOutput.intermediate_outputs on the second call) to assert that
result.intermediate_outputs equals only the latest RunOutput
intermediate_outputs ({"new_key":"new_val"}) instead of containing the old
"chain_of_thought" key, keeping other assertions (id change, input, usage
accumulation, and output) as-is.
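
The expected semantics can be restated as a sketch (field names and dict shapes are assumptions, not the real TaskRun API): usage counters accumulate across turns, while output and intermediate_outputs are replaced by the latest turn.

```python
def merge_continuation(existing: dict, latest: dict) -> dict:
    # Usage accumulates across turns...
    usage = {
        k: existing["usage"].get(k, 0) + latest["usage"].get(k, 0)
        for k in set(existing["usage"]) | set(latest["usage"])
    }
    # ...while intermediate_outputs and output come only from the
    # latest assistant message (replaced, not merged).
    return {
        "usage": usage,
        "intermediate_outputs": latest["intermediate_outputs"],
        "output": latest["output"],
    }
```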

---

Duplicate comments:
In `@libs/server/kiln_server/test_run_api.py`:
- Around line 1686-1729: The fixture adapter_sanity_check_setup contains a
hard-coded user path and never uses the tmp_path param (and is unused by tests);
remove this fixture entirely or refactor it to use tmp_path: replace the
hardcoded Path("/Users/...") with tmp_path / "adapter_sanity_project" /
"project.kiln", ensure Project/Task are created/loaded from that temp path,
update Config.shared() modifications to use the temp project.path, and keep the
teardown that restores config._settings["projects"]; if no tests reference
adapter_sanity_check_setup simply delete the fixture to avoid CI breakage.

---

Nitpick comments:
In `@libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py`:
- Around line 13-17: The mock async method _run is missing the explicit type for
the prior_trace parameter; update its signature to match the base class by
annotating prior_trace as list[ChatCompletionMessageParam] | None (or the
appropriate alias used in the codebase) so the mock matches the base typing for
_run and tools like static checkers; ensure any required imports or forward
references for ChatCompletionMessageParam are present.

In `@libs/server/kiln_server/test_run_api.py`:
- Around line 1676-1684: Remove the unused helper functions
_adapter_sanity_check_output_path() and _append_to_sanity_check() from the file:
these helpers are never called and the path helper writes into the source tree;
delete both definitions and update any tests that relied on them to instead use
a temporary path (e.g., pytest tmp_path) or test-local fixtures if output needs
to be captured. Ensure there are no remaining imports or references to
_adapter_sanity_check_output_path or _append_to_sanity_check in the module after
removal.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ece44a3f-dca5-4fae-a601-325afa5c5f54

📥 Commits

Reviewing files that changed from the base of the PR and between 04748e4 and e71efe6.

📒 Files selected for processing (3)
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
  • libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
  • libs/server/kiln_server/test_run_api.py

@leonardmq leonardmq requested a review from scosman March 6, 2026 10:01
@leonardmq
Collaborator Author

Taken over by #1107


@leonardmq leonardmq closed this Mar 9, 2026