Dataset with tools shouldn't be included in a fine tune when no tools are selected. #1137
chiang-daniel merged 11 commits into main from dchiang/filter-ft-dataset-with-skill
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable; use the following commands to manage reviews, or the checkboxes below for quick actions.
Walkthrough

Adds explicit empty-tool filtering across finetune flows: the backend endpoint gains an `empty_tool_filter` parameter so that an explicit empty tool selection is distinguished from no filter at all.
Sequence Diagram(s): (Skipped)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the accuracy and consistency of dataset selection for fine-tuning, particularly concerning tool-based filtering. It addresses a critical scenario where datasets containing tools might have been erroneously included when a user explicitly requested datasets without any tools. The changes involve a more nuanced interpretation of tool filter parameters in the backend API, a client-side mechanism to correctly transmit 'no tools selected' states, and an update to the data model to better represent tool information within datasets, especially in cases of tool mismatches.
📊 Coverage Report

Overall Coverage: 91% (Diff: origin/main...HEAD)
Code Review
This pull request correctly implements a crucial distinction between having no tool filter (`None`) and explicitly filtering for datasets with no tools (`[]`). The changes are well-implemented across the Python backend, Svelte frontend, and data models. The introduction of `empty_tool_filter` is a smart workaround for frontend limitations, and the data model change in `DatasetToolInfo` to use `None` for mismatched tools is a good semantic improvement. Overall, this is a solid PR that improves the correctness and clarity of the fine-tuning API. I have one suggestion for a minor refactoring to improve code conciseness.
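The `None`-vs-`[]` distinction the reviews describe can be sketched in plain Python; the function and field names below are illustrative only, not Kiln's actual API:

```python
from typing import Optional

def filter_runs(runs: list[dict], tool_filter: Optional[list[str]]) -> list[dict]:
    """Illustrative sketch of None-vs-empty-list filter semantics."""
    if tool_filter is None:
        return runs  # no filter supplied: include every run
    if not tool_filter:
        # explicit empty filter: only truly tool-free runs
        return [r for r in runs if not r.get("tools")]
    # non-empty filter: runs whose tool set matches the selection
    return [r for r in runs if set(r.get("tools", [])) == set(tool_filter)]
```

The bug this PR fixes amounts to `[]` being collapsed into the `None` branch, so runs that carried tools slipped through an explicitly tool-free selection.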
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/desktop/studio_server/finetune_api.py`:
- Line 546: The current call replaces tool_info.tools=None with [] losing the
mixed-vs-empty distinction; change the call to pass tool_info.tools through
unchanged (i.e., remove the "or []" collapse) so load_skills_from_tool_ids
receives None for mixed-tool datasets and [] only when the source truly provides
an empty list, or alternatively update load_skills_from_tool_ids to accept and
distinguish None vs []; specifically, modify the invocation at the top-level
(the expression producing skills_dict) to pass tool_info.tools as-is and ensure
load_skills_from_tool_ids's logic handles None as "mixed" and [] as "no tools".
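A minimal sketch of the suggested contract, where `None` marks a mixed-tool dataset and `[]` a genuinely tool-free one (hypothetical names, not the real `load_skills_from_tool_ids` signature):

```python
from typing import Optional

def load_skills_from_ids(tool_ids: Optional[list[str]]) -> dict[str, str]:
    """Hypothetical: distinguish mixed (None) from empty ([]) selections."""
    if tool_ids is None:
        # mixed tool/skill selections across the dataset: refuse rather
        # than silently treating the dataset as tool-free
        raise ValueError("mixed tool/skill selections")
    if not tool_ids:
        return {}  # truly tool-free dataset: no skills to load
    # placeholder lookup standing in for the real skill resolution
    return {tid: f"skill:{tid}" for tid in tool_ids}
```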
In `@app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte`:
- Around line 229-237: The URL parsing currently maps an explicit
fine_tuning_tools= to [] via has_fine_tuning_tools / fine_tuning_tools_list but
the URL-vs-saved-state equality check omits fine_tuning_tools, so a saved state
with fine_tuning_tools=null can be treated as "same" and override the URL;
update the state-comparison logic to include the fine_tuning_tools semantics
(use has_fine_tuning_tools and/or fine_tuning_tools_list) so that an explicit
empty list (has_fine_tuning_tools=true and fine_tuning_tools_list==[]) is
considered different from null and preserved, and apply the same change to the
equivalent block around the other occurrence (lines ~255-260).
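The gist of that fix, sketched in Python for brevity (the real code is Svelte, and the field names here are illustrative): the URL-vs-saved-state comparison must include the tools field, so an explicit empty list is not conflated with null.

```python
def same_session(url_state: dict, saved_state: dict) -> bool:
    # Hypothetical comparison: include fine_tuning_tools so an explicit []
    # (user asked for "no tools") differs from None (unset).
    fields = ("task_id", "fine_tuning_tools")  # illustrative field set
    return all(url_state.get(f) == saved_state.get(f) for f in fields)
```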
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9141143d-fc0c-423c-8a52-bee9c33a00ef
📒 Files selected for processing (10)
app/desktop/studio_server/finetune_api.py
app/desktop/studio_server/test_finetune_api.py
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/src/routes/(app)/dataset/[project_id]/[task_id]/[run_id]/run/+page.svelte
app/web_ui/src/routes/(app)/dataset/[project_id]/[task_id]/add_data/+page.svelte
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/select_finetune_dataset.svelte
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte
libs/core/kiln_ai/datamodel/dataset_split.py
libs/core/kiln_ai/datamodel/test_dataset_split.py
….com/Kiln-AI/Kiln into dchiang/filter-ft-dataset-with-skill
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/desktop/studio_server/finetune_api.py (1)
545-551: Verify the mixed-dataset export guard only blocks generated prompts.

When `custom_system_message` is present, this endpoint never needs `skills`: `system_message_from_request()` ignores them and `DatasetFormatter` only consumes the resolved string. The unconditional 400 therefore blocks a path that otherwise has enough information to export. If mixed datasets are only unsupported when Kiln has to infer skills, gate the error to the generator path instead.

💡 Possible adjustment
  tool_info = dataset.tool_info()
- if tool_info.tools is None:
+ if tool_info.tools is None and not custom_system_message:
      raise HTTPException(
          status_code=400,
          detail="Dataset contains mixed tool/skill selections and cannot be exported",
      )
- skills_dict = load_skills_from_tool_ids(task, tool_info.tools)
+ skills_dict = (
+     {}
+     if tool_info.tools is None
+     else load_skills_from_tool_ids(task, tool_info.tools)
+ )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/desktop/studio_server/finetune_api.py` around lines 545-551: The current guard always raises HTTPException when tool_info.tools is None, which blocks exports even when a custom_system_message is provided; change the check to only reject mixed tool/skill datasets when the generator path is used (i.e., when no custom_system_message is present). Concretely, in the block using dataset.tool_info(), only raise the 400 if tool_info.tools is None AND custom_system_message is falsy, and only call load_skills_from_tool_ids(task, tool_info.tools) when custom_system_message is falsy so that system_message_from_request() / DatasetFormatter can proceed without skills.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte`:
- Around line 229-241: The comparison is order-sensitive because
fine_tuning_tools_list is joined without normalization; sort the list before
creating fine_tuning_tools_key so that order changes don't alter the key (i.e.,
compute fine_tuning_tools_list, then sort it, then set fine_tuning_tools_key =
fine_tuning_tools_list === null ? null : fine_tuning_tools_list.join(","));
apply the same normalization (sort before join) to the analogous skills/tools
keys referenced around the other block (the symbols to update are
fine_tuning_tools_list, fine_tuning_tools_key and the corresponding skills
list/key used at lines ~263-267).
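The normalization that review asks for amounts to sorting before serializing. In Python (illustrative; the actual fix lives in the Svelte page):

```python
from typing import Optional

def tools_key(tools: Optional[list[str]]) -> Optional[str]:
    # None (no constraint) stays None; lists are sorted first so the
    # same set in a different order yields the same key.
    return None if tools is None else ",".join(sorted(tools))
```

Note the empty list still produces a key (`""`) distinct from `None`, preserving the explicit-empty semantics.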
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 526f4ca0-b2c5-471d-ab4c-de6e1eaf3848
📒 Files selected for processing (3)
app/desktop/studio_server/finetune_api.py
app/desktop/studio_server/test_finetune_api.py
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte
Actionable comments posted: 1
♻️ Duplicate comments (1)
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte (1)
238-241: ⚠️ Potential issue | 🟡 Minor

Normalize `fine_tuning_tools` before comparing state. This `join(",")` check is order-sensitive, so the same inherited tool/skill set in a different order still looks like a different session and triggers the replace-session dialog.

💡 Suggested change
+ const normalizeFineTuningTools = (tools: string[] | null) =>
+   tools === null ? null : [...tools].sort().join(",")
+
- const fine_tuning_tools_key =
-   fine_tuning_tools_list === null
-     ? null
-     : fine_tuning_tools_list.join(",")
+ const fine_tuning_tools_key =
+   normalizeFineTuningTools(fine_tuning_tools_list)
@@
- ($saved_state.fine_tuning_tools === null
-   ? null
-   : $saved_state.fine_tuning_tools.join(",")) ===
-   fine_tuning_tools_key
+ normalizeFineTuningTools($saved_state.fine_tuning_tools) ===
+   fine_tuning_tools_key

Also applies to: 264-267
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte` around lines 238-241: The comparison uses fine_tuning_tools_list.join(",") which is order-sensitive; normalize the list before serializing by sorting or otherwise canonicalizing it (e.g., use fine_tuning_tools_list.slice().sort() then join) so identical sets in different orders produce the same fine_tuning_tools_key; update the creation of fine_tuning_tools_key and the analogous usage around the other occurrence (the block referenced at lines 264-267) to use the same normalized sorting approach and keep null handling unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte`:
- Around line 758-770: The selectors retain stale selections when
mandatory_tools/mandatory_skills becomes an empty array because load_tools()
only watches project_id/task_id; in tools_selector.svelte and
skills_selector.svelte add a reactive statement that sets tools = [] (and skills
= [] respectively) when the corresponding mandatory_tools/mandatory_skills is an
array with length 0 and the locked/disabled flag is true; locate the variables
and existing load_tools() logic (e.g., the load_tools() watcher around line 41)
and add the clear-on-empty reactive check so the locked selector emits an
explicit empty selection instead of stale values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2f9905ad-e106-474c-b497-2b137d87969d
📒 Files selected for processing (1)
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte`:
- Around line 45-54: The current reactive block clears selections whenever
tools_selector_settings.disabled is true and mandatory_tools resolves to an
empty array (but mandatory_tools defaults to []), which can wipe valid
selections; change the condition to only treat an "empty lock" as explicit when
the caller actually provided mandatory_tools (e.g.,
tools_selector_settings.mandatory_tools !== undefined) — so update the reactive
statement that references tools_selector_settings and mandatory_tools to check
that mandatory_tools is defined (not just an empty array) before clearing tools,
and ensure you don't persist the cleared value unless this explicit-empty
condition is met.
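One way to distinguish "caller explicitly passed an empty list" from "parameter left at its default" is a sentinel default, sketched here in Python (the real component is Svelte and checks `!== undefined` instead):

```python
_UNSET = object()  # sentinel: parameter was not provided at all

def should_clear_selection(mandatory_tools=_UNSET) -> bool:
    # Clear only on an explicit empty lock; a defaulted/absent value
    # must not wipe the user's current selections.
    return mandatory_tools is not _UNSET and mandatory_tools == []
```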
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ab9d57cd-d3f5-4f68-91db-982c68b03899
📒 Files selected for processing (2)
app/web_ui/src/lib/ui/run_config_component/skills_selector.svelte
app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte
What does this PR do?
Previously, leaving tools and skills empty in the fine-tune flow could still include runs with tools and allow reuse of datasets that were not actually tool-free. This was caused by empty selections not being preserved consistently through the UI/API flow, and by mixed tool/skill datasets being collapsed into the same representation as truly tool-free datasets.
Fixed by making empty selections explicit end-to-end, distinguishing mismatched datasets from empty tool sets, and preserving inherited empty tool/skill constraints in SDG.
Checklists