🎉 Charts as first-class citizens in ETL: layered approach with grapher#6174
🎉 Charts as first-class citizens in ETL: layered approach with grapher#6174pabloarosado wants to merge 26 commits into
Conversation
Lets ETL push standalone Grapher charts as zero-dimension mdim collections. A `.config.yml` with `dimensions: []` and exactly one view is routed through `Collection.upsert_to_db` to a new endpoint that writes the chart's ETL-authored grapher config into `chart_configs.etlConfig` (added on the grapher side). The endpoint lives on owid-grapher: PUT /admin/api/charts/:id/etlConfig Admin patches in `chart_configs.patch` are preserved across ETL re-pushes by construction — ETL and admin write to different columns. No drift-check apparatus, no bot-user stamping, no ETL_FORCE_CHART. The full rendered config is recomputed server-side as merge(variableETL, etlConfig, patch). Contents: - `apps/chart_sync/admin_api.py`: new `put_chart_etl_config` and `delete_chart_etl_config` methods. - `etl/collection/chart_upsert.py`: new module that translates a zero-dim collection view into a grapher chart config (resolving catalog paths to variable IDs) and pushes it via the new endpoint. Creates the chart with a minimal bootstrap if it doesn't exist yet. - `etl/collection/model/core.py`: `Collection.upsert_to_db` short- circuits to `upsert_collection_as_chart` when `dimensions` is empty. - `etl/steps/__init__.py`: `ExportStep` can run from a `.config.yml` with no Python script for `export://multidim/...` paths. - `apps/step_update/cli.py` + `etl/version_tracker.py`: recognise YAML-only export steps in step discovery and version tracking. - `schemas/multidim-schema.json`: documents that `dimensions` can be empty for single-chart collections (the schema already allowed it structurally — `minItems` was never set). - `etl/steps/export/multidim/animal_welfare/latest/banning_of_chick_culling.config.yml` + DAG entry: chick-culling chart as the example/test artifact. Depends on owid-grapher PR #6553 (the endpoint and the etlConfig column don't exist on master yet). To test end-to-end on staging: SSH into the staging server, check out `etl-chart-config` in the owid-grapher folder, run the migration, restart admin, then run this ETL step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Quick links (staging server):
Login: chart-diff: ✅No charts for review.data-diff: ✅ No differences foundAutomatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included Edited: 2026-06-09 14:51:16 UTC |
- `Collection.upsert_to_db`: when dimensions are empty, explicitly require exactly one view before dispatching to chart upsert. Previously the assertion lived only inside `upsert_collection_as_chart`; surfacing it at the dispatch point makes the implicit "zero-dim ⇒ single chart" behaviour switch concrete with a clear error for the malformed case. - `dag/animal_welfare.yml`: rewrite the test-scaffolding-flavoured comment on the chick-culling export step to a normal one-liner. Resolves codex findings 4 and 5 from the PR review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DAG entries for `export://multidim/...` steps backed by a `.config.yml` (single-chart or mdim collections without a `.py`) were not clickable because the resolver only looked for `.py`, `__init__.py`, or `.ipynb`. Add `.config.yml` to the candidate list for `export://` URIs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single-chart `.config.yml` files (zero-dim collection + no sibling `.py`)
push to the chart admin endpoint and are created unpublished by default.
The public `/grapher/{slug}` route doesn't render unpublished charts, so
clicking "Preview chart" was failing for newly-created ETL-authored
charts. Detect the single-chart case and route the preview through
`/admin/grapher/{slug}` instead, which renders drafts.
`parseExportMultidim` becomes async and returns `isChart` + `chartSlug`
based on (a) `dimensions: []` in the .config.yml AND (b) no sibling
`.py` file (multi-dim steps like air_pollution use `dimensions: []` as
a placeholder filled programmatically — those still go via the mdim
route).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`OWIDEnv.admin_site` already ends in `/admin`, so prefixing `/admin/charts/...` produced URLs like `http://.../admin/admin/charts/9138/edit`. Drop the duplicated segment; matches the convention used by `datasets_admin_site` etc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `/grapher/<slug>.png` route only renders published charts. For drafts it 404s — which is the common case while iterating on an ETL-authored chart. owid-grapher already exposes `/grapher/by-uuid/<chart_configs.id>.png` which serves any chart from R2's `byUUID/` directory, published or not. Adds `get_chart_png_url.py`: takes a chart slug or numeric chart id, queries the staging DB via `OWID_ENV.read_sql` to resolve it to the `chart_configs.id` UUID, prints the by-uuid PNG URL. Accepts extra grapher query params via `--key=value` flags (tab, time, country, …). Updates SKILL.md to recommend this as the primary path; keeps the agent-browser path as a fallback for MDIM data pages and interactive UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous content documented a `.chart.yml` system in `etl/steps/graph/` that was an unmerged experiment and no longer exists in the repo. Rewrite the skill to cover the actual Phase 1 workflow: single ETL-authored charts as zero-dim mdim collections in `etl/steps/export/multidim/`, pushed via `chart_configs.etlConfig`, previewed via the `by-uuid` PNG route. Covers: file layout, minimum viable config, DAG entry, indicator catalog paths + display.name overrides, common chart-config edits, push command, preview via the check-chart-preview skill, and the admin co-editing model. Adds a "when NOT to use" section pointing at create-multidim for charts with dropdowns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The hens-by-housing-system chart was an iteration experiment, not a canonical reference. The chick-culling chart is the committed Phase 1 example in this repo, so cite only that. Replace the eggs-and-hens indicator paths in the display-name example with placeholders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d4954b3a57
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Adds “single-chart” support to the ETL multidim export pipeline by treating a zero-dimension collection (dimensions: [] with exactly one view) as a regular Grapher chart, pushed via a layered ETL-config endpoint so admin edits can override ETL defaults. This also updates tooling (ETL step execution, version tracking, VSCode extensions, and Claude skills) to support YAML-only export steps and draft chart preview flows.
Changes:
- Route zero-dimension multidim collections through a new chart upsert path (create/update chart + PUT chart
etlConfig) instead of the MDIM endpoint. - Allow YAML-only
export://multidim/...steps to execute without a.pyfile (and make tooling resolve.config.ymlas the “script”). - Update VSCode extensions + Claude skills to preview draft charts correctly and to link DAG steps to
.config.ymlwhen needed.
Reviewed changes
Copilot reviewed 17 out of 19 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
vscode_extensions/clickable-dag-steps/src/extension.ts |
Adds .config.yml as a resolvable target for YAML-only export steps. |
vscode_extensions/clickable-dag-steps/package.json |
Bumps extension version to 0.0.5. |
vscode_extensions/clickable-dag-steps/CHANGELOG.md |
Documents YAML-only export step support (but version history has a gap). |
vscode_extensions/chart-preview/src/extension.ts |
Adds chart-vs-mdim classification for export/multidim and uses admin preview routes for drafts. |
vscode_extensions/chart-preview/package.json |
Bumps extension version to 0.0.2. |
schemas/multidim-schema.json |
Updates schema guidance to explicitly allow dimensions: [] for single-chart mode. |
etl/version_tracker.py |
Allows get_path_to_script to resolve .config.yml for export/multidim steps. |
etl/steps/export/multidim/animal_welfare/latest/banning_of_chick_culling.config.yml |
Adds a concrete example of an ETL-authored single-chart collection. |
etl/steps/__init__.py |
Enables YAML-only export/multidim execution via ExportStep fallback runner. |
etl/collection/model/core.py |
Routes zero-dimension collections to chart upsert; rejects zero-dim collections with multiple views. |
etl/collection/chart_upsert.py |
Implements chart creation/update + ETL config push for single-chart collections. |
dag/animal_welfare.yml |
Wires the example export step into the DAG. |
apps/step_update/cli.py |
Makes step_update copy YAML-only export steps (no .py) correctly. |
apps/chart_sync/admin_api.py |
Adds AdminAPI methods to PUT/DELETE chart etlConfig. |
.claude/skills/check-chart-preview/SKILL.md |
Updates preview workflow, emphasizing by-UUID PNG rendering for drafts. |
.claude/skills/check-chart-preview/get_chart_png_url.py |
Adds helper to resolve slug/id → by-UUID PNG URL on staging. |
.claude/skills/chart-editing/SKILL.md |
Updates “chart-editing” skill docs to target single-chart .config.yml in export/multidim. |
- chart_upsert: set topic_tags on newly created ETL-authored charts (admin-owned after creation) - chart-preview ext: use fs.access for .py existence check instead of readFile - clickable-dag-steps: backfill 0.0.4 changelog entry Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…quired - Schema: drop `title`/`default_selection` from top-level required; drop `title_variant` from the title block's required. - Model: `Collection.title` and `Collection.default_selection` default to None. - Runtime: error when an mdim collection has no title; per-view warning when a non-map-only view has no entity selection fallback (neither collection-level `default_selection` nor per-view `selectedEntityNames`). - Example YAML + chart-editing skill: drop top-level `title:` / `default_selection:` from single-chart configs (they were ignored in chart-mode anyway). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Very nice! Worked a charm for me. A couple of ideas, but nothing prevents merging.
|
lucasrodes
left a comment
There was a problem hiding this comment.
Awesome, @pabloarosado! This is a great addition. I quickly scanned the code, and left some minor suggestions. But overall, nothing critical! Excited to try out this tool in PROD.
I've tried the workflow with Claude and it mostly worked fine. Below are some comments on the different stages and a final one on removing charts
Creating the chart
Fetch this branch and (pointing to the corresponding staging server) ask Claude to create a new chart (on whatever topic).
This worked fine! I prompted Claude with the following instruction
i want to test the work from this PR. I've been suggested
to create a chart to begin with, can you assist me w that?
I'd like a scatterplot that shows life expectancy vs
population.
And it generated a new YAML file in etl/steps/export/multidim/demography/latest/life_expectancy_vs_population.config.yml. It looked good.
On the flip side, I felt it was a bit slow (took ~4 minutes). I am not sure if this is expected, or maybe there is something we can improve in the SKILL to make this faster. From my understanding, the skill presumably in use (chart-editing) is focussed on editing charts. Should we adapt it to also have the appropriate context for creating charts?
Claude's full transcript
● Skill(chart-editing)
⎿ Successfully loaded skill
● Let me look at the current state of the graph steps and PR
work to understand the setup.
Searched for 1 pattern (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl log --oneline master..HEAD
| head -20; echo ---; git -C /home/lucas/repos/etl diff
master...HEAD --stat | tail -30)
⎿ 8c6a5d037 Merge remote-tracking branch 'origin/maste
r' into refactor-migrate-co2
6b640f214 📊 Guinea Worm data update (#6178)
… +45 lines (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl branch --show-current;
echo ---; git -C /home/lucas/repos/etl log --oneline -8;
echo ---; rg --files /home/lucas/repos/etl -g "*.ch…)
⎿ refactor-migrate-co2
work to understand the setup.
Searched for 1 pattern (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl log --oneline master..HEAD | head -20; echo ---; git -C /home/lucas/repos/etl diff
master...HEAD --stat | tail -30)
⎿ 8c6a5d037 Merge remote-tracking branch 'origin/master' into refactor-migrate-co2
6b640f214 📊 Guinea Worm data update (#6178)
d9e42c81b Add women's rights, human rights, crime, media, social connection and technology indicators to Int
… +36 lines (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl branch --show-current; echo ---; git -C /home/lucas/repos/etl log --oneline -8;
echo ---; rg --files /home/lucas/repos/etl -g "*.ch…)
⎿ refactor-migrate-co2
---
8c6a5d037 Merge remote-tracking branch 'origin/master' into refactor-migrate-co2
… +45 lines (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl branch -a | rg -i "chart" ; echo ---; git -C /home/lucas/repos/etl log --oneline
-5 etl-chart-config 2>/dev/null; echo ---; git -C…)
⎿ bug-chart-diff-failure
bug-chart-diff-query
bug-remove-frame-chart
… +57 lines (ctrl+o to expand)
● The branch exists but I need to see what's actually in it — the paths the skill mentions didn't match. Let me check
what the PR changed:
● Bash(git -C /home/lucas/repos/etl diff master...etl-chart-config --stat | tail -40)
⎿ .claude/skills/chart-editing/SKILL.md | 230 +++++++++++----------
.claude/skills/check-chart-preview/SKILL.md | 40 ++--
.../check-chart-preview/get_chart_png_url.py | 78 +++++++
… +18 lines (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl show
etl-chart-config:etl/steps/export/charts/animal_welfare/latest/banning_of_chick_culling.config.yml; echo
===DAG===; git -C /ho…)
⎿ Error: Exit code 128
fatal: path 'etl/steps/export/charts/animal_welfare/latest/banning_of_chick_culling.config.yml' does not exist in
'etl-chart-config'
(eval):1: ==DAG=== not found
● Bash(git -C /home/lucas/repos/etl show etl-chart-config:.claude/skills/chart-editing/SKILL.md)
⎿ ---
name: chart-editing
description: Create or edit an ETL-authored Grapher chart — a single-chart `.config.yml` in `etl/steps/expor
… +161 lines (ctrl+o to expand)
● Bash(git -C /home/lucas/repos/etl checkout etl-chart-config && git -C /home/lucas/repos/etl status --short | head)
⎿ Switched to branch 'etl-chart-config'
Your branch is up to date with 'origin/etl-chart-config'.
?? scripts/apply_softhyphen_prefixes.py
… +4 lines (ctrl+o to expand)
Listed 1 directory (ctrl+o to expand)
● Bash(rg "data://grapher/demography/2024-07-18/population|data://grapher/demography/2025-10-22/life_expectancy"
/home/lucas/repos/etl/dag/*.yml -l | head -5; echo ---…)
⎿ /home/lucas/repos/etl/dag/demography.yml
/home/lucas/repos/etl/dag/static_viz.yml
---LE-COLS---
… +19 lines (ctrl+o to expand)
Searched for 1 pattern (ctrl+o to expand)
● Bash(rg -A5 "grapher/demography/2024-07-18/population" /home/lucas/repos/etl/dag/demography.yml | head -10; echo
---; ls /home/lucas/repos/etl/etl/steps/data/grapher…)
⎿ Error: Exit code 2
data://grapher/demography/2024-07-18/population_doubling_times:
- data://garden/demography/2024-07-18/population_doubling_times
## Population comparison
data://grapher/demography/2023-07-03/world_population_comparison:
- data://garden/demography/2023-06-27/world_population_comparison:
---
population_doubling_times.py
__pycache__
● Branch switched — the skill list now reflects the PR version. Now let me pin down the population grapher step and
the exact indicator paths:
Listed 1 directory (ctrl+o to expand)
● Bash(.venv/bin/python -c "
from owid.catalog import Dataset…)
⎿ ['historical', 'population', 'population_density', 'population_growth_rate', 'projections']
['population', 'source', 'world_pop_share']
---
… +40 lines (ctrl+o to expand)
● Indicators identified. Let me check how existing configs handle scatterplots (especially the continents color
dimension):
Searched for 4 patterns (ctrl+o to expand)
● I have everything I need. Let me verify how config-only export steps work (whether a .py is needed) and check the
chick-culling DAG entry:
● Bash(ls /home/lucas/repos/etl/etl/steps/export/multidim/animal_welfare/latest/; echo ---DAG---; rg -n -B3 -A4
"multidim/animal_welfare" /home/lucas/repos/etl/dag/ani…)
⎿ banning_of_chick_culling.config.yml
---DAG---
112- #
… +55 lines (ctrl+o to expand)
● YAML-only config steps confirmed. Now let me look at where to add the DAG entry and check the multidim schema for
the scatter indicator shape:
● Bash(rg -n "life_expectancy" /home/lucas/repos/etl/dag/demography.yml | head; echo ---; tail -20
/home/lucas/repos/etl/dag/demography.yml; echo ---SCHEMA---; .venv/b…)
⎿ 184: data://grapher/demography/2025-10-22/life_expectancy:
185: - data://garden/demography/2025-10-22/life_expectancy:
208: data://grapher/oecd/2023-10-11/life_expectancy_birth:
… +89 lines (ctrl+o to expand)
Read 1 file (ctrl+o to expand)
● Now I'll create the chart config and DAG entry:
● Write(etl/steps/export/multidim/demography/latest/life_expectancy_vs_population.config.yml)
⎿ Wrote 31 lines to etl/steps/export/multidim/demography/latest/life_expectancy_vs_population.config.yml
1 topic_tags:
2 - "Life Expectancy"
3 dimensions: []
4 views:
5 - dimensions: {}
6 indicators:
7 y:
8 - catalogPath: "life_expectancy_at_birth#life_expectancy_0"
9 display:
10 name: "Life expectancy at birth"
… +21 lines (ctrl+o to expand)
● Update(dag/demography.yml)
⎿ Added 8 lines
188 - data://garden/demography/2025-10-22/life_tables
189 - data://garden/un/2024-07-12/un_wpp
190
191 + #
192 + # Life expectancy vs. population — chart authored in ETL.
193 + #
194 + export://multidim/demography/latest/life_expectancy_vs_population:
195 + - data://grapher/demography/2025-10-22/life_expectancy
196 + - data://grapher/demography/2024-07-15/population
197 + - data://grapher/regions/2023-01-01/regions
198 +
199 # Broken limits of Life Expectancy
200 data://grapher/demography/2025-10-22/broken_limits_le:
201 - data://garden/demography/2025-10-22/broken_limits_le:
● Both files are in place. Let me validate the DAG wiring with a quick dry-run (no execution):
● Bash(.venv/bin/etlr export://multidim/demography/latest/life_expectancy_vs_population --export --dry-run 2>&1 | tail
-15)
⎿ --- Detecting which steps need rebuilding...
OK (1.1s)
--- Would run 9 steps (at least 3m 51.1s):
… +9 lines (ctrl+o to expand)
● Everything resolves. The chart is ready to test. Here's what I set up:
What was created
1. Chart config — etl/steps/export/multidim/demography/latest/life_expectancy_vs_population.config.yml
A zero-dimension mdim (the PR's single-chart format) with one scatter view:
- y: life_expectancy_at_birth#life_expectancy_0 (OMM life expectancy, demography/2025-10-22/life_expectancy)
- x: population#population (demography/2024-07-15/population), log scale since population spans ~5 orders of
magnitude
- color: regions#owid_region — continent coloring, same pattern the COVID scatters use
2. DAG entry — dag/demography.yml:191 with the three grapher deps.
The dry-run validates fine — public slug will be life-expectancy-vs-population.
Next step (for you to run)
Some upstream deps aren't built locally (it wants to rebuild the hmd → life_tables chain, ~4 min), so fetch them
from the catalog instead:
PREFER_DOWNLOAD=1 .venv/bin/etlr export://multidim/demography/latest/life_expectancy_vs_population --export
--private
On success it prints admin_url=http://staging-site-etl-chart-config/admin/charts/<id>/edit.
Then for iterating on the YAML afterwards, the fast loop is:
.venv/bin/etlr export://multidim/demography/latest/life_expectancy_vs_population --export --force --only
Once it's pushed, I can fetch the PNG preview (via get_chart_png_url.py from the PR) so we can check the scatter
visually — let me know how the run goes.
✻ Cooked for 4m 14s
Previewing the chart
My understanding from Claude's output is that I need to execute the export step. So the preview is only possible after the chart exists in Admin. Just to confirm, did I understand this correctly?
After running the etlr command, I went ahead and typed /check-chart-preview.
I like the previewing chart a lot, but I think it'd be helpful at some point to get the URL to the chart. Maybe when etlr runs the export:// step and upserts a chart to the DB, so that it is easier to go to the live chart.
E.g. some message after the --- Finished export://multidim/demography/latest/life_expectancy_vs_population (2.8s), like chart available at staging-.../admin/charts/<chart_id>/edit.
Updating & syncing with Admin
Run the export step of the chart again, and check that the title is still the one manually set from the admin.
works perfectly! Also, if then i go and edit the title in ETL to match Admin's, then Admin recognizes that it is synced with ETL.
From the chart admin, link again the title field to etl, and check that the chart uses the etl values again.
Works!
Removing charts
One thing that came to mind is what should be the behavior when someone removes a chart in Admin. If that happens, and we don't do the same in ETL (e.g. archiving the chart step), the chart will appear again in PROD. Do you have an intuition on how we can solve this to avoid human errors? Somehow we should make ETL aware of the list of catalog paths of the "removed charts".
Maybe we need to add a new table in Grapher (e.g. charts_removed) that lists the IDs and chart paths of the removed charts (along with other fields like removedAt, removedBy, etc.). This is a bit related to the idea of having catalogPath in the charts table, which I shared in owid/owid-grapher#6553 (comment).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hi Mojmir, thanks for the review! I agree with all your points:
|
validate_title() raised for Explorers (Collections with dimensions) even though explorers manage their own title, breaking test_explorer_migration. Gate both validations on _collection_type == "multidim". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Hi @lucasrodes, thanks for trying it out and for the feedback! Let me address your comments:
I'll have a look at the rest of your comments, thanks for the thorough review! |
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The admin etlConfig endpoint only checks the schema version, not structure, so a typo'd field or wrong type would be stored and silently render wrong. Validate the built config against the locally vendored grapher schema before pushing (no network), surfacing a clear error pointing at the offending field. Skips if the config's schema version isn't vendored locally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* 🎉🤖 Key ETL charts by catalogPath (parallel to mdims) Persist the collection's catalog_path on charts.catalogPath (via the etlConfig endpoint) and look up the chart by it instead of by slug. This gives single charts the same stable ETL identity mdims already have (multi_dim_data_pages.catalogPath), with the slug left as the mutable public URL. Behavior change: ETL no longer adopts a hand-authored chart that merely shares a slug — it keys on catalogPath, so an accidental slug collision surfaces at publish time instead of silently overwriting. Pairs with owid-grapher PR (adds the charts.catalogPath column). Targets the chart-config feature branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * 📜🤖 Fix put_chart_etl_config docstring for row-model storage The ETL config is stored in its own chart_configs row (charts.configIdETL), not in a chart_configs.etlConfig column. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ETL-authored charts are created independently on staging and prod, so they have different ids but a shared catalogPath. Match them across environments by catalogPath (falling back to id+createdAt for hand-authored charts), sync the etlConfig layer alongside the main config, and bootstrap a created chart with its etlConfig + catalogPath so ETL can adopt it. Approvals/conflicts for "catalog-path twins" (prod minted its own row) key on a NULL target timestamp. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ins) set_status recorded a catalog-path twin's approval with the prod chart's updatedAt, but _get_approvals looks twins up with a NULL target timestamp — so re-approving an existing ETL chart would lose the approval. Record via _target_updated_at_for_review so recording and lookup agree. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… columns Production's charts table doesn't have catalogPath/configIdETL yet (owid-grapher #6553 isn't merged to master), so chart-diff — which loads prod charts — failed with "Unknown column", breaking the staging build. Added a PROD_HAS_ETL_CHART_COLUMNS flag (default False): chart-diff falls back to id+createdAt matching and defers those columns out of the prod query. No practical loss yet (prod has no ETL/catalogPath charts). TODO: remove the flag + defer() once the grapher schema is in prod. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the PROD_HAS_ETL_CHART_COLUMNS flag with a runtime check of the target (production) DB's information_schema: chart-diff uses catalogPath matching only when charts.catalogPath/configIdETL actually exist there, otherwise it defers those columns out of the query and falls back to id+createdAt matching. The ETL PR merges after owid-grapher #6553, so prod will already have the columns by then and chart-diff switches over on its own — no flag to flip or code to touch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cache_all built a canonical, hashable form of the arguments and then called the
wrapped function with that canonical form instead of the originals. A DataFrame
argument therefore arrived as a ("__obj__", "DataFrame", ...) tuple, so e.g.
producer_analytics' _process_df crashed with "'tuple' object has no attribute
'groupby'" on the non-Streamlit (CLI/test) path. Now the canonical form is used
only as the cache key; the function is always called with the original args.
Pre-existing bug (from #5056), unrelated to the chart-config work — it just
happens to surface in the staging integration tests on this branch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nction" This reverts commit 0b0bdc4.
… run test_chart_diff.py imported chart_diff at module top. pytest imports every test module during collection (even ones deselected by -m integration), and importing chart_diff applies its module-level @st_cache_data decorators, which call the @cache'd is_running_in_streamlit(). At collection time there's no Streamlit runtime, so it froze to False process-wide and routed producer_analytics' cached _process_df onto the buggy non-Streamlit path, crashing an unrelated integration test. Moving the import into the test bodies keeps collection clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ETL-authored charts now set isInheritanceEnabled=True explicitly when the chart row is first created, so grapher_config defined on an indicator flows into every chart built on it. Set only at creation, so existing charts (ETL or admin) are never touched. No owid-grapher change needed; the admin create endpoint already honors the inheritance flag. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Charts that pre-date ETL authorship have no catalogPath, so the catalogPath-only lookup failed against any DB where the step had never run (rebuilt staging, prod at merge time) and crashed on the slug collision with the existing chart. Fall back to the slug and adopt the chart: put_chart_etl_config stamps the catalog path, so later runs find it by catalogPath directly. Refuse to adopt a chart already owned by a different ETL step. Verified on staging: first run adopts chart 7118 with a byte-identical rendered config (admin map override preserved, no version bump); second run takes the regular catalogPath update path. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Human summary
Lets ETL create and edit charts as zero-dimension mdim collections, using a layered approach, so that changes made from the grapher chart admin supersede the ETL config.
Depends on owid/owid-grapher#6553.
Tracking issue: #6069
This PR is the etl-side of Phase 1 in the roadmap document.
Core implementation idea
A chart is defined in etl as an
export://multidimstep (with a.config.ymlfile) withdimensions: [](i.e. no dropdown selectors; not to be confused with the grapher config'sdimensions, which lists the plotted indicators) and exactly one view. This step gets routed throughCollection.upsert_to_dbto the newPUT /admin/api/charts/:id/etlConfigendpoint, which stores the chart's grapher config as a separatechart_configsrow, pointed to by a newcharts.configIdETLcolumn (mirroring howvariables.grapherConfigIdETLalready works).The rendered config is
merge(variableETL, etlConfig, patch): ETL writes theetlConfigrow, admins writepatch, so the two never collide and an ETL re-push can't clobber an admin edit. We also setcharts.catalogPathas the chart's stable ETL identity (like multi-dims already have).Additional changes
How to test