Fix: expand dict transformation by Shamik-07 · Pull Request #9561 · marimo-team/marimo

Shamik-07 · 2026-05-15T22:27:25Z

📝 Summary

Uses Narwhals to convert every backend to Polars, expands the dict column with Polars' unnest function, and then converts the result back to the original backend.
Closes #4583

Screen.Recording.2026-05-15.at.18.07.461.mov

📋 Pre-Review Checklist

For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
Video or media evidence is provided for any visual changes (optional).

✅ Merge Checklist

I have read the contributor guidelines.
Documentation has been updated where applicable, including docstrings for API changes.
Tests have been added for the changes made.

vercel · 2026-05-15T22:27:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	Jul 3, 2026 3:32pm

cubic-dev-ai

1 issue found across 3 files

Architecture diagram

sequenceDiagram
    participant User as User (Marimo UI)
    participant DFPlugin as Dataframe Plugin
    participant Handler as ExpandDict Handler (handlers.py)
    participant Narwhals as Narwhals Layer
    participant Polars as Polars Engine
    participant Backend as Original Backend (pandas/Ibis/other)

    Note over User,Backend: Expand Dict Transformation Flow

    User->>DFPlugin: Trigger expand dict on column
    DFPlugin->>Handler: handle_expand_dict(df, transform)

    Handler->>Narwhals: collect_and_preserve_type(df)
    Narwhals->>Backend: Collect actual data from original backend
    Backend-->>Narwhals: Data as native type
    Narwhals-->>Handler: Collected DataFrame + undo function

    Handler->>Polars: collected_df.to_polars()
    Note over Handler,Polars: Convert to Polars for unnest support

    Polars->>Polars: polars_df.unnest(column_id)
    Note over Polars: Handles null dict values correctly

    Polars-->>Handler: Unnested Polars DataFrame

    Handler->>Narwhals: nw.from_native(unnested)
    Narwhals->>Handler: Narwhals wrapper

    Handler->>Handler: undo(narwhals_df)
    Note over Handler: Convert back to original backend type

    Handler-->>DFPlugin: Transformed DataFrame
    DFPlugin-->>User: Updated table with expanded columns

mscolnick

need to take an optional dep on polars

Shamik-07 · 2026-05-19T22:30:16Z

need to take an optional dep on polars

Done.
Added an allow_none_equals_nan option to assert_frame_equal_with_nans. Because None != NaN, using assert_frame_equal_with_nans was failing for test_expand_dict.
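
The effect of such a flag can be illustrated without any dataframe library. The sketch below is a plain-Python model, not marimo's assert_frame_equal_with_nans: the helpers `is_missing` and `values_equal` are hypothetical, and only the flag name `allow_none_equals_nan` is taken from the comment above.

```python
import math
from typing import Any


def is_missing(value: Any) -> bool:
    """Return True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def values_equal(
    left: Any, right: Any, allow_none_equals_nan: bool = False
) -> bool:
    """Compare two cell values, optionally treating None and NaN as equal."""
    if allow_none_equals_nan and is_missing(left) and is_missing(right):
        return True
    # NaN never compares equal to itself, so match NaN to NaN explicitly.
    if (
        isinstance(left, float)
        and isinstance(right, float)
        and math.isnan(left)
        and math.isnan(right)
    ):
        return True
    return left == right


nan = float("nan")
strict = values_equal(None, nan)  # flag off: None and NaN differ
relaxed = values_equal(None, nan, allow_none_equals_nan=True)
```

With the flag off, a backend that reports a missing field as None and another that reports it as NaN would compare as different, which matches the failure described above.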

Shamik-07 · 2026-05-19T23:12:49Z

There are some pandas CI errors that I am looking into.

Shamik-07 · 2026-05-19T23:30:00Z

There are some pandas CI errors that I am looking into.

This is happening because of a data conversion mismatch between pandas and Narwhals on columns with mixed data types:

"Could not convert '3' with type str: tried to convert to double"

for

test_print_code_result_matches_actual_transform_pandas(
    transform=ExpandDictTransform(
        type=TransformType.EXPAND_DICT,
        column_id='strings',
    ),
)

So my only option is to fall back to processing the pandas backend separately for the unnest.
This should fix it.

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

codecov · 2026-05-20T21:22:55Z

Bundle Report

Bundle size has no change ✅

Affected Assets, Files, and Routes:

view changes for bundle: marimo-esm

Assets Changed:

Asset Name	Size Change	Total Size	Change (%)
`assets/dist-*.js`	-107 bytes	169 bytes	-38.77%
`assets/dist-*.js`	72 bytes	176 bytes	69.23% ⚠️
`assets/dist-*.js`	-76 bytes	183 bytes	-29.34%
`assets/dist-*.js`	-52 bytes	335 bytes	-13.44%
`assets/dist-*.js`	-152 bytes	104 bytes	-59.38%
`assets/dist-*.js`	35 bytes	137 bytes	34.31% ⚠️
`assets/dist-*.js`	-60 bytes	104 bytes	-36.59%
`assets/dist-*.js`	107 bytes	276 bytes	63.31% ⚠️
`assets/dist-*.js`	60 bytes	164 bytes	57.69% ⚠️
`assets/dist-*.js`	-144 bytes	259 bytes	-35.73%
`assets/dist-*.js`	87 bytes	256 bytes	51.48% ⚠️
`assets/dist-*.js`	227 bytes	403 bytes	128.98% ⚠️
`assets/dist-*.js`	23 bytes	160 bytes	16.79% ⚠️
`assets/dist-*.js`	40 bytes	177 bytes	29.2% ⚠️
`assets/dist-*.js`	-233 bytes	102 bytes	-69.55%
`assets/dist-*.js`	210 bytes	387 bytes	118.64% ⚠️
`assets/dist-*.js`	-14 bytes	169 bytes	-7.65%
`assets/dist-*.js`	-23 bytes	137 bytes	-14.37%

cubic-dev-ai · 2026-05-30T01:03:02Z

@cubic-dev-ai

@kirangadhave I have started the AI code review. It will take a few minutes to complete.

Copilot

Pull request overview

This PR makes the dataframe Expand Dict transform robust to nulls by routing expansion through backend-native implementations (Polars unnest and Pandas json_normalize) and adds/updates tests to validate the behavior, including nested dict values.

Changes:

Update runtime transform handling to expand dict/struct columns using Pandas-native logic for pandas inputs and Polars unnest otherwise.
Update generated “print code” for Expand Dict in pandas and polars to match the new implementations.
Expand test coverage for Expand Dict with nulls and nested dicts; adjust equality helper to optionally treat None and NaN as equivalent.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
marimo/_plugins/ui/_impl/dataframes/transforms/handlers.py	Implements Expand Dict via pandas `json_normalize` or Polars `unnest` after collection.
marimo/_plugins/ui/_impl/dataframes/transforms/print_code.py	Updates printed code generation for Expand Dict for pandas and polars.
tests/_plugins/ui/_impl/dataframes/test_handlers.py	Unskips/extends Expand Dict tests (nulls + nested dicts) and tweaks dataframe comparison helper.
tests/_plugins/ui/_impl/dataframes/test_print_code.py	Adds print-code parity tests for Expand Dict with nested dicts for pandas and polars.

        result = apply(df, in_transform)
        assert_frame_equal_with_nans(result, expected)

-    @staticmethod
-    @pytest.mark.parametrize(
-        ("df", "expected"),
-        list(
-            zip(
-                create_test_dataframes(
-                    {"nulls": [1, 2, 3, None, "hello"]}, include=["pandas"]
-                ),
-                create_test_dataframes({"nulls": [None]}, include=["pandas"]),
-                strict=False,
-            )
-        ),
-    )
-    def test_filter_rows_null_pandas_object(
-        df: DataFrameType, expected: DataFrameType
-    ) -> None:
-        in_transform = FilterRowsTransform(
-            type=TransformType.FILTER_ROWS,
-            operation="keep_rows",
-            where=FilterGroup(
-                type="group",
-                operator="and",
-                children=[
-                    FilterCondition(
-                        type="condition",
-                        column_id="nulls",
-                        operator="in",
-                        value=[None],
-                    )
-                ],
-            ),
-        )
-        result = apply(df, in_transform)
-        assert_frame_equal_with_nans(result, expected)
-
    @staticmethod
    @pytest.mark.parametrize(


cubic-dev-ai

No issues found across 4 files

Architecture diagram

sequenceDiagram
    participant UI as DataFrame UI
    participant Handler as NarwhalsTransformHandler
    participant Narwhals as Narwhals Layer
    participant Backend as DataFrame Backend
    participant PrintCode as Print Code Generator

    Note over UI,PrintCode: Expand Dict Transform Flow - Current State

    UI->>Handler: handle_expand_dict(DataFrame, ExpandDictTransform)
    Handler->>Narwhals: collect_and_preserve_type(df)
    Narwhals-->>Handler: (collected_df, undo)
    Handler->>Narwhals: collected_df.to_native()
    Narwhals-->>Handler: native_df

    alt Pandas Backend
        Handler->>Handler: Check if pandas dataframe
        Handler->>Backend: result_df = native_df.copy()
        Handler->>Backend: expanded = pd.json_normalize(result_df.pop(column_id).map(...), max_level=0)
        Backend-->>Handler: expanded DataFrame
        Handler->>Backend: expanded.index = result_df.index
        Handler->>Backend: result_df.join(expanded)
        Backend-->>Handler: joined DataFrame
        Handler->>Narwhals: undo(nw.from_native(joined))
        Narwhals-->>Handler: original backend type
    else Polars Backend
        Handler->>Narwhals: collected_df.to_polars()
        Narwhals-->>Handler: polars_df
        Handler->>Backend: polars_df.unnest(column_id)
        Backend-->>Handler: unnested DataFrame
        Handler->>Narwhals: undo(nw.from_native(unnested))
        Narwhals-->>Handler: original backend type
    end
    Handler-->>UI: Transformed DataFrame

    Note over UI,PrintCode: Print Code Generation

    UI->>PrintCode: Generate Python code for transform
    PrintCode->>PrintCode: Check backend type

    alt Pandas Backend
        PrintCode->>PrintCode: Generate: df.join(pd.json_normalize(df.pop(col).map(...), max_level=0).set_axis(...))
    else Polars Backend
        PrintCode->>PrintCode: Generate: df.unnest(column_id)
    end
    PrintCode-->>UI: Generated code string

kirangadhave · 2026-05-30T01:13:34Z

@Shamik-07 can you please update the video in the PR to a higher res version? I'm having difficulty reading text in it

Shamik-07 · 2026-06-01T21:42:20Z

@Shamik-07 can you please update the video in the PR to a higher res version? I'm having difficulty reading text in it

Unfortunately, due to GitHub upload limitations, I can't upload a high-res video.
Please find the notebook attached instead.
issue_4583_expand_dict_null_values.py

kirangadhave

please address copilot review. We can do a manual round of reviews after that.

Shamik-07 · 2026-06-02T21:24:56Z

please address copilot review. We can do a manual round of reviews after that.

Thanks, fixed.

kirangadhave

Requesting a major change that should remove the hard dependency on Polars for other backends that support structs.

Also please address other nits.

kirangadhave · 2026-06-30T06:14:44Z

+            expanded.index = result_df.index
+            return undo(nw.from_native(result_df.join(expanded)))
+
+        DependencyManager.polars.require(


Non-pandas dataframes are forced to go through a Polars conversion here. For DuckDB or Ibis, that means forcing a Polars installation. We should not do that.

Narwhals has struct.field, so we can do:

schema = df.collect_schema()
fields = [f.name for f in schema[col].fields]
df.with_columns(
    [nw.col(col).struct.field(f).alias(f) for f in fields]
).drop(col)

This approach also stays lazy. The pandas approach with json_normalize is correct.

Shamik-07 · Jul 2, 2026

Are you sure that every backend that supports a struct schema also provides struct.field?
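
The schema-driven expansion discussed in this thread can be modelled in plain Python. This is only a sketch of the intended semantics, not Narwhals code: `expand_struct_column` is a hypothetical helper, rows are represented as dicts, and the field names are assumed to come from a known schema rather than from inspecting each value.

```python
from typing import Any, Optional

Row = dict[str, Any]


def expand_struct_column(
    rows: list[Row], column: str, fields: list[str]
) -> list[Row]:
    """Replace `column` with one column per name in `fields`."""
    expanded: list[Row] = []
    for row in rows:
        struct: Optional[dict[str, Any]] = row[column]
        # Copy every column except the struct column itself.
        new_row = {key: value for key, value in row.items() if key != column}
        for field in fields:
            # A null struct yields a null for every field.
            new_row[field] = None if struct is None else struct.get(field)
        expanded.append(new_row)
    return expanded


rows = [
    {"id": 1, "meta": {"a": 10, "b": "x"}},
    {"id": 2, "meta": None},
]
result = expand_struct_column(rows, "meta", ["a", "b"])
```

Because the field names are fixed up front, the output columns do not depend on the data in any particular row, which is the property that lets a schema-based approach avoid collecting the frame first.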

kirangadhave · 2026-06-30T06:18:01Z

+            # older versions of pandas running on py310 otherwise CI will fail
+            expanded = pd.json_normalize(
+                result_df.pop(transform.column_id).map(
+                    lambda value: {} if value is None else value


Also check for NaN here
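
A minimal sketch of the requested check, assuming missing entries show up either as None or as a float NaN (pandas object columns can hold either). The helper name `to_mapping` is hypothetical; it stands in for the lambda passed to `.map(...)` in the hunk above.

```python
import math
from typing import Any


def to_mapping(value: Any) -> Any:
    """Return an empty dict for None or float NaN, else the value unchanged."""
    if value is None:
        return {}
    if isinstance(value, float) and math.isnan(value):
        return {}
    return value


cells = [{"a": 1}, None, float("nan"), {"b": 2}]
normalized = [to_mapping(cell) for cell in cells]
```

The `isinstance` guard matters because `math.isnan` raises a TypeError when given a dict.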

kirangadhave · 2026-06-30T06:20:01Z

+            import pandas as pd
+
+            result_df = native_df.copy()
+            # max_level=0 was used so that pandas doesn't recursively unnest dicts


The comment narrates the code; simplify it to explain the why instead.

kirangadhave · 2026-06-30T06:22:46Z

+                max_level=0,
+            )
+            expanded.index = result_df.index
+            return undo(nw.from_native(result_df.join(expanded)))


Duplicate column names after unnest will throw an error here. Handle it gracefully.
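
One way to handle this is to detect collisions before joining and raise a descriptive error. The sketch below is an assumption about what "gracefully" could mean, not the code from this PR; `find_conflicts` and `check_no_conflicts` are hypothetical names, and column names are passed as plain lists.

```python
def find_conflicts(existing: list[str], expanded: list[str]) -> list[str]:
    """Return the expanded column names that already exist, in order."""
    taken = set(existing)
    return [name for name in expanded if name in taken]


def check_no_conflicts(
    existing: list[str], expanded: list[str], column: str
) -> None:
    """Raise ValueError if expanding `column` would duplicate a column name."""
    # The expanded column itself is dropped, so it cannot collide.
    remaining = [name for name in existing if name != column]
    conflicts = find_conflicts(remaining, expanded)
    if conflicts:
        raise ValueError(
            f"Cannot expand {column!r}: columns {conflicts} already exist"
        )


# No collision: "meta" expands into "a" and "b" next to "id".
check_no_conflicts(["id", "meta"], ["a", "b"], "meta")
```

Checking up front gives the user a message that names the clashing columns instead of a backend-specific join error.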

kirangadhave · 2026-06-30T06:22:52Z

+            why="to expand dict/struct columns for non-pandas backends"
+        )
+        polars_df = collected_df.to_polars()
+        unnested = polars_df.unnest(transform.column_id)


Duplicate column names after unnest will throw an error here. Handle it gracefully.

kirangadhave · 2026-06-30T06:23:28Z

 pd = pytest.importorskip("pandas")
-pytest.importorskip("polars")
 pytest.importorskip("pyarrow")
+pytest.importorskip("polars")


import order change is unnecessary

Shamik-07 · 2026-07-03T15:32:57Z

@kirangadhave

Thanks for the review; I have addressed your comments.
I still have one question: are you sure that every backend that supports a struct schema also provides struct.field?

Shamik-07 added 4 commits May 15, 2026 17:52

fix: using polars unnest for all backend types.

838d75f

fix: update the expand dict print statement.

851e5b1

tests: enabling the expand dict test and adding the necessary dataframes.

0ea6758

tests: removing ibis skip in expand dict test.

294e21e

Shamik-07 mentioned this pull request May 15, 2026

New dataframe transform: Polars Native Expand Dict Transformation #4583

Open

vercel Bot deployed to Preview May 15, 2026 22:28 View deployment

cubic-dev-ai Bot reviewed May 15, 2026

View reviewed changes

Comment thread marimo/_plugins/ui/_impl/dataframes/transforms/handlers.py Outdated

vercel Bot deployed to Preview May 15, 2026 22:31 View deployment

mscolnick reviewed May 18, 2026

View reviewed changes

Comment thread tests/_plugins/ui/_impl/dataframes/test_handlers.py

mscolnick requested changes May 18, 2026

View reviewed changes

Shamik-07 added 3 commits May 19, 2026 17:45

Merge branch 'main' into fix/expand_dict_transformation

0d1e52f

fix: removing additional polars dataframe creation and using the none rows with the create test dataframes instead.

24f672b

fix: using polars as optional in test handlers.

4e82164

refactor: adding None == NaN in assert frame equal with nans method to use it in expand_dict test.

vercel Bot deployed to Preview May 19, 2026 22:29 View deployment

Shamik-07 requested a review from mscolnick May 19, 2026 22:30

fix: processing pandas backend separately to not cause arrow coercion errors for mixed object columns.

88ce223

vercel Bot deployed to Preview May 19, 2026 23:30 View deployment

cubic-dev-ai Bot reviewed May 19, 2026

View reviewed changes

Comment thread marimo/_plugins/ui/_impl/dataframes/transforms/handlers.py

Shamik-07 added 2 commits May 19, 2026 19:39

fix: mypy error.

1f722b8

fix: changing the expand dict print code function for pandas to using json normalise following the handlers code.

ff063e5

vercel Bot deployed to Preview May 19, 2026 23:50 View deployment

Merge branch 'main' into fix/expand_dict_transformation

eb5ed5f

vercel Bot deployed to Preview May 20, 2026 21:22 View deployment

Merge branch 'main' into fix/expand_dict_transformation

6e16445

vercel Bot deployed to Preview May 21, 2026 21:14 View deployment

Copilot AI reviewed May 30, 2026

View reviewed changes

cubic-dev-ai Bot reviewed May 30, 2026

View reviewed changes

Light2Dark requested a review from kirangadhave June 1, 2026 08:03

Merge branch 'main' into fix/expand_dict_transformation

832133e

vercel Bot deployed to Preview June 1, 2026 21:45 View deployment

Merge branch 'main' into fix/expand_dict_transformation

850721f

vercel Bot deployed to Preview June 2, 2026 17:58 View deployment

kirangadhave requested changes Jun 2, 2026

View reviewed changes

Shamik-07 added 3 commits June 2, 2026 17:12

test: adding back the test_filter_rows_null_pandas_object in test_handlers module.

98d3a8b

fix: revert python3 to python in precommit config.

85035c3

fix: requiring polars for expand dict from polars.

bf7db87

vercel Bot deployed to Preview June 2, 2026 21:23 View deployment

Shamik-07 requested a review from kirangadhave June 2, 2026 21:24

Merge branch 'main' into fix/expand_dict_transformation

12f341d

vercel Bot deployed to Preview June 2, 2026 22:12 View deployment

Merge branch 'main' into fix/expand_dict_transformation

de0e372

vercel Bot deployed to Preview June 3, 2026 21:10 View deployment

kirangadhave requested changes Jun 30, 2026

View reviewed changes

Merge branch 'main' into fix/expand_dict_transformation

65176f2

vercel Bot deployed to Preview July 2, 2026 20:04 View deployment

Shamik-07 added 2 commits July 3, 2026 11:17

fix: removed polars dependency in expand dict

2a349d4

feat: raising duplicate columns error test: adding necessary tests for duplicate columns

Merge branch 'main' into fix/expand_dict_transformation

992ad9f

vercel Bot deployed to Preview July 3, 2026 15:20 View deployment

fix: raising an error at the end of handle expand dict.

be23303

vercel Bot deployed to Preview July 3, 2026 15:32 View deployment
