Cell loader dedents lines inside multiline string literals — runtime value differs from Python semantics #9851

Description

@dirkenglund

Bug

When a cell defines a multiline (triple-quoted) string whose content lines are indented, marimo's file loader dedents the cell body by the function indent, including the lines inside the string literal. As a result, the runtime value under marimo run / marimo edit differs from what Python itself produces when importing the same file.

Minimal repro (marimo 0.19.11, python 3.14)

nb.py:

import marimo

__generated_with = "0.19.11"
app = marimo.App()


@app.cell
def _():
    TEXT = """line0
  two_spaces
    four_spaces
      six_spaces"""
    return (TEXT,)

Then inspect what the loader extracts, from a separate script:

import importlib.util
from marimo._ast.app import InternalApp

spec = importlib.util.spec_from_file_location("nb", "nb.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
cell = next(c for c in InternalApp(mod.app).cell_manager._cell_data.values()
            if "TEXT" in (c.code or ""))
ns = {}
exec(cell.code, ns)
print(repr(ns["TEXT"]))
# marimo:  'line0\n  two_spaces\nfour_spaces\n  six_spaces'
# python:  'line0\n  two_spaces\n    four_spaces\n      six_spaces'

Lines inside the string that start with at least 4 spaces lose exactly those 4 spaces ('    four_spaces' → 'four_spaces', '      six_spaces' → '  six_spaces'); lines with fewer leading spaces keep theirs. Running python nb.py or importing the module gives the original string.
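The observed output is consistent with a dedent that strips the function indent from every physical line without checking whether the line sits inside a string literal. The sketch below is only a model of that behavior, not marimo's actual loader code: naive_dedent and CELL_BODY are hypothetical names introduced for illustration.

```python
def naive_dedent(body: str, indent: str = "    ") -> str:
    """Hypothetical model: strip `indent` from every line that starts with it."""
    return "\n".join(
        line[len(indent):] if line.startswith(indent) else line
        for line in body.split("\n")
    )


# The cell body as it appears inside `def _():` in nb.py.
CELL_BODY = (
    '    TEXT = """line0\n'
    '  two_spaces\n'
    '    four_spaces\n'
    '      six_spaces"""\n'
    '    return (TEXT,)'
)

dedented = naive_dedent(CELL_BODY)

# Drop the trailing `return`, which is only valid inside a function body.
ns = {}
exec(dedented.rsplit("\n", 1)[0], ns)
print(repr(ns["TEXT"]))
# 'line0\n  two_spaces\nfour_spaces\n  six_spaces'
```

This reproduces the string reported for marimo above, which suggests (but does not prove) that the loader works on raw line prefixes.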

Real-world impact

We embed Lean 4 proof source in notebook cells and send it to a verification API (lean-marimo-quantum). Lean is indentation-sensitive: tactic lines indented ≥4 spaces arrived at the kernel at column 0 and failed with "expected '{' or indented tactic sequence", but only when served via marimo run; the same file imported by Python verified fine. This was confusing to debug because the divergence is invisible in the source. Anything indentation-sensitive (YAML, Python source strings, Makefiles, Lean) hits this.

Workaround we adopted: keep multiline literals in a plain imported module rather than in cells.

Expected

Cell extraction should preserve string-literal content byte-for-byte (dedent decisions based on AST token positions rather than raw line prefixes), or marimo check/docs should at least warn about indented multiline strings in cells.
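One way to base the dedent on token positions is sketched below, using only the standard-library tokenize module. It is an illustration under stated assumptions, not a proposed patch: string_continuation_rows and token_aware_dedent are hypothetical helpers, the indent is assumed to be 4 spaces, and only plain string literals are handled (f-strings tokenize differently on Python 3.12+ and would need extra handling).

```python
import io
import tokenize


def string_continuation_rows(body: str) -> set[int]:
    """Return 1-based rows of `body` that begin inside a multiline string."""
    # Wrap the body in a function header so the tokenizer sees valid layout.
    wrapped = "def _():\n" + body
    rows: set[int] = set()
    for tok in tokenize.generate_tokens(io.StringIO(wrapped).readline):
        if tok.type == tokenize.STRING and tok.end[0] > tok.start[0]:
            # Rows after the opening line belong to the literal; subtract 1
            # to undo the offset introduced by the wrapper line.
            rows.update(r - 1 for r in range(tok.start[0] + 1, tok.end[0] + 1))
    return rows


def token_aware_dedent(body: str, indent: str = "    ") -> str:
    """Strip `indent` only from lines that do not start inside a string."""
    protected = string_continuation_rows(body)
    out = []
    for row, line in enumerate(body.split("\n"), start=1):
        if row not in protected and line.startswith(indent):
            line = line[len(indent):]
        out.append(line)
    return "\n".join(out)


# Same cell body as in nb.py.
CELL_BODY = (
    '    TEXT = """line0\n'
    '  two_spaces\n'
    '    four_spaces\n'
    '      six_spaces"""\n'
    '    return (TEXT,)'
)

fixed = token_aware_dedent(CELL_BODY)
ns = {}
exec(fixed.rsplit("\n", 1)[0], ns)  # skip the trailing `return`
print(repr(ns["TEXT"]))
# 'line0\n  two_spaces\n    four_spaces\n      six_spaces'
```

With this approach the code lines are still dedented, while the lines belonging to the literal pass through unchanged, so the value matches what Python produces on import.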

Implementation found by Claude (Anthropic) in collaboration with @dirkenglund.
