Skip to content

feat(ai): defer marimo-pair references and improve code mode prompts#9985

Merged
Light2Dark merged 2 commits into
mainfrom
sham/prompt-improvements
Jun 26, 2026
Merged

feat(ai): defer marimo-pair references and improve code mode prompts#9985
Light2Dark merged 2 commits into
mainfrom
sham/prompt-improvements

Conversation

@Light2Dark

@Light2Dark Light2Dark commented Jun 25, 2026

Copy link
Copy Markdown
Member

📝 Summary

image

Code mode had no way to load deeper reference docs without bloating the system prompt. I split the reference material into deferred pydantic-ai capabilities so the model pulls in gotchas, notebook-improvements, and rich-representations only when needed.

  • Bundle three reference markdown files and expose them as deferred Capability objects wired into the code-mode agent harness
  • Update SKILL.md and reference docs to tell the model to use load_capability instead of reading files from disk
  • Streamline the code-mode system prompt: drop per-language rules and injected cell code (the model can inspect the notebook via execute_code), and add a nudge to inspect notebook state first
  • Expose code mode in the chat mode picker outside dev builds

📋 Pre-Review Checklist

  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
  • Video or media evidence is provided for any visual changes (optional).

✅ Merge Checklist

  • I have read the contributor guidelines.
  • Documentation has been updated where applicable, including docstrings for API changes.
  • Tests have been added for the changes made.

Made with Cursor

Review in cubic

Load skill reference docs as on-demand pydantic-ai capabilities to keep
code mode context lean, streamline the system prompt, and expose code mode
in the chat UI.

Co-authored-by: Cursor <cursoragent@cursor.com>
@Light2Dark Light2Dark added the enhancement New feature or request label Jun 25, 2026
@vercel

vercel Bot commented Jun 25, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
marimo-docs Ready Ready Preview, Comment Jun 25, 2026 3:50pm

Request Review

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 13 files

Architecture diagram
sequenceDiagram
    participant Client as Chat Client (Frontend)
    participant API as API Route (stream_completion)
    participant Provider as AI Provider (stream_completion_harness)
    participant Agent as pydantic-ai Agent
    participant Capability as Deferred Capability
    participant Tool as execute_code Tool
    participant Session as Kernel Session
    participant SkillLoader as Skill/Reference Loader

    Note over Client,SkillLoader: Code Mode Chat Flow - NEW: Chat Mode Picker Unrestricted

    Client->>API: POST /api/ai/completion (mode=code_mode)
    
    Note over API,Session: NEW: Code mode no longer receives cell code injection

    Provider->>Provider: Build system prompt (no language rules, no other cell code)

    Provider->>SkillLoader: Load SKILL.md (marimo-pair)
    SkillLoader-->>Provider: SKILL.md content (with load_capability instructions)

    Provider->>Provider: Create pydantic-ai model

    Provider->>Capability: NEW: references_capability()
    Capability->>SkillLoader: Load gotchas.md (deferred)
    Capability->>SkillLoader: Load notebook-improvements.md (deferred)
    Capability->>SkillLoader: Load rich-representations.md (deferred)
    SkillLoader-->>Capability: Reference content (cached, not inlined)

    Provider->>Agent: Create Agent(toolsets=[build_execute_code_toolset], capabilities=3 deferred refs, instructions=system_prompt)

    Provider->>Agent: Submit message

    alt Model needs reference material
        Agent->>Capability: load_capability("gotchas")
        Capability-->>Agent: Gotchas content (instructions)
        Agent->>Capability: load_capability("rich-representations")
        Capability-->>Agent: Rich representations content
    end

    alt Model needs to inspect/run code
        Agent->>Tool: execute_code(code)
        Tool->>Session: Run scratchpad code with credentials
        Session-->>Tool: Execution result
        Tool-->>Agent: CodeExecutionResult
    end

    Agent-->>Provider: Stream response tokens
    Provider->>API: StreamingResponse
    API-->>Client: Stream chat response

    Note over Client,Agent: NEW: Reference docs loaded on-demand, not inline in prompt
Loading

Re-trigger cubic

@Light2Dark Light2Dark marked this pull request as ready for review June 25, 2026 07:16
## 📝 Summary
<!--
If this PR closes any issues, list them here by number (e.g., Closes
#123).

Detail the specific changes made in this pull request. Explain the
problem addressed and how it was resolved. If applicable, provide before
and after comparisons, screenshots, or any relevant details to help
reviewers understand the changes easily.
-->
Previously, every /chat turn rebuilt AiCompletionContext from current
notebook state and appended it to the system prompt. Two problems:

- Prompt caching: dynamic context in the (otherwise static) system
prefix changed the cacheable prefix every turn, defeating provider
prompt caching.
- Correctness/cost: variables mentioned in turn 1 was re-resolved
against turn-N's live values and resent in full each request.

It's also possible to do this for /completion endpoints but because
those endpoints are not multi-turn convos today, the benefits are not
material.

### Approach

Adopt the canonical AI SDK pattern for custom structured context:

- Frontend resolves @-mentions once at send time (resolveChatContext)
and embeds the rendered string in the user message as a
data-marimo-context part (typed via the SDK's DataUIPart). Because it
lives in chat history, it's replayed verbatim on later turns — a
point-in-time snapshot of the rendered context, not a live
re-resolution.
- Server lowers that data part into a <context>…</context> text part
placed below the user's message before validation
(_expand_marimo_context_part), mirroring the SDK's convertDataPart. This
is required because pydantic-ai's VercelAIAdapter drops DataUIParts
entirely.


## 📋 Pre-Review Checklist
<!-- These checks need to be completed before a PR is reviewed -->

- [ ] For large changes, or changes that affect the public API: this
change was discussed or approved through an issue, on
[Discord](https://marimo.io/discord?ref=pr), or the community
[discussions](https://github.com/marimo-team/marimo/discussions) (Please
provide a link if applicable).
- [x] Any AI generated code has been reviewed line-by-line by the human
PR author, who stands by it.
- [ ] Video or media evidence is provided for any visual changes
(optional). <!-- PR is more likely to be merged if evidence is provided
for changes made -->

## ✅ Merge Checklist

- [x] I have read the [contributor
guidelines](https://github.com/marimo-team/marimo/blob/main/CONTRIBUTING.md).
- [ ] Documentation has been updated where applicable, including
docstrings for API changes.
- [x] Tests have been added for the changes made.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions github-actions Bot added the bash-focus Area to focus on during release bug bash label Jun 25, 2026
def references_capability() -> list[Capability]:
from pydantic_ai.capabilities import Capability

gotchas_capability: Capability = Capability(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if you can make this dynamic from reading the references file. not necessary now, but maybe in future if we vendor this from marimo-pair

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 12 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="frontend/src/components/chat/chat-panel.tsx">

<violation number="1" location="frontend/src/components/chat/chat-panel.tsx:537">
P2: Potential loss of legacy context resolution for old persisted chats after removing `buildCompletionRequestBody`. The old helper re-scanned all historical message text for `@` references on every completion request (regenerate/retry). The new code only preserves context for messages that already contain the new `data-marimo-context` part. Old persisted chats without that part will silently lose `@` context on regeneration.</violation>
</file>

<file name="frontend/src/components/editor/ai/completion-utils.ts">

<violation number="1" location="frontend/src/components/editor/ai/completion-utils.ts:95">
P2: New `resolveChatContext` duplicates core parsing/attachment logic from `getAICompletionBodyWithAttachments`, creating drift risk. Only the new path stamps `providerMetadata.marimo.source = "context"`, so `isContextAttachment`-based filtering behaves inconsistently across chat and completion attachments. Extract a shared helper for the common registry parsing/attachment fetching logic.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

const completionBody = await buildCompletionRequestBody(
options.messages,
);
const completionBody = {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Potential loss of legacy context resolution for old persisted chats after removing buildCompletionRequestBody. The old helper re-scanned all historical message text for @ references on every completion request (regenerate/retry). The new code only preserves context for messages that already contain the new data-marimo-context part. Old persisted chats without that part will silently lose @ context on regeneration.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At frontend/src/components/chat/chat-panel.tsx, line 537:

<comment>Potential loss of legacy context resolution for old persisted chats after removing `buildCompletionRequestBody`. The old helper re-scanned all historical message text for `@` references on every completion request (regenerate/retry). The new code only preserves context for messages that already contain the new `data-marimo-context` part. Old persisted chats without that part will silently lose `@` context on regeneration.</comment>

<file context>
@@ -532,9 +534,10 @@ const ChatPanelBody = () => {
-        const completionBody = await buildCompletionRequestBody(
-          options.messages,
-        );
+        const completionBody = {
+          uiMessages: options.messages,
+          includeOtherCode: getCodes(""),
</file context>

* Resolve @-context for messages. They represent referenced
* datasets, variables, or other context from the user's prompt.
*/
export async function resolveChatContext(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: New resolveChatContext duplicates core parsing/attachment logic from getAICompletionBodyWithAttachments, creating drift risk. Only the new path stamps providerMetadata.marimo.source = "context", so isContextAttachment-based filtering behaves inconsistently across chat and completion attachments. Extract a shared helper for the common registry parsing/attachment fetching logic.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At frontend/src/components/editor/ai/completion-utils.ts, line 95:

<comment>New `resolveChatContext` duplicates core parsing/attachment logic from `getAICompletionBodyWithAttachments`, creating drift risk. Only the new path stamps `providerMetadata.marimo.source = "context"`, so `isContextAttachment`-based filtering behaves inconsistently across chat and completion attachments. Extract a shared helper for the common registry parsing/attachment fetching logic.</comment>

<file context>
@@ -50,6 +50,91 @@ export function getAICompletionBody({
+ * Resolve @-context for messages. They represent referenced
+ * datasets, variables, or other context from the user's prompt.
+ */
+export async function resolveChatContext(
+  input: string,
+): Promise<ResolvedChatContext> {
</file context>

@Light2Dark Light2Dark merged commit e55d04d into main Jun 26, 2026
48 checks passed
@Light2Dark Light2Dark deleted the sham/prompt-improvements branch June 26, 2026 02:14
@github-actions

Copy link
Copy Markdown

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.23.12-dev3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bash-focus Area to focus on during release bug bash enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants