🎉 🤖 add file-first source for posts_gdocs#6545

Draft
mlbrgl wants to merge 8 commits into graphite-base/6545 from claude-cms-file

@mlbrgl mlbrgl commented May 27, 2026

Member

Context

Links to issues, Figma, Slack, and a technical introduction to the work.

Screenshots / Videos / Diagrams

Add if relevant, i.e. might not be necessary when there are no UI changes.

Testing guidance

Step-by-step instructions on how to test this change

  • Does the change work in the archive?
  • Does the staging experience have sign-off from product stakeholders?

Reminder to annotate the PR diff with design notes, alternatives you considered, and any other helpful context.

Checklist

(delete all that do not apply)

Before merging

  • Google Analytics events were adapted to fit the changes in this PR
  • Changes to CSS/HTML were checked on Desktop and Mobile Safari at all three breakpoints
  • Changes to HTML were checked for accessibility concerns

If DB migrations exist:

  • If columns have been added/deleted, all necessary views were recreated and ETL and Analytics team members have been informed of the incoming changes
  • The DB type definitions have been updated
  • The DB types in the ETL have been updated
  • Update the documentation in db/docs

After merging

  • If a table was touched that is synced to R2, the sync script to update R2 has been run

mlbrgl and others added 8 commits May 27, 2026 22:19
Introduce a new storage source for gdocs so ArchieML content can live in a
local content repo instead of Google Docs. A new posts_gdocs.source column
('gdocs' | 'file', default 'gdocs') records where each gdoc's authoritative
content lives. GdocBase.fetchFromFile reads the matching .md under
CONTENT_REPO_PATH (identified by the first 12 chars of the gdoc id) and runs
it through archieToEnriched — the same enrichment pipeline used for Google
Docs fetches.
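
A minimal sketch of the lookup described above, assuming the content repo uses the `<type>/<slug>--<shortId>.md` layout from the export script and only one level of type directories. `shortIdOf` and `findContentFile` are hypothetical names for illustration; the actual `GdocBase.fetchFromFile` implementation is not shown in this PR description.

```typescript
import * as fs from "node:fs"
import * as path from "node:path"

// Hypothetical helper: the short id is the first 12 characters of the gdoc id.
const shortIdOf = (gdocId: string): string => gdocId.slice(0, 12)

// Hypothetical lookup: scan each <type>/ directory under the content repo for
// a file whose name ends in `--<shortId>.md`. Returns undefined on a miss.
function findContentFile(
    contentRepoPath: string,
    gdocId: string
): string | undefined {
    const suffix = `--${shortIdOf(gdocId)}.md`
    const entries = fs.readdirSync(contentRepoPath, { withFileTypes: true })
    for (const entry of entries) {
        if (!entry.isDirectory()) continue
        const dir = path.join(contentRepoPath, entry.name)
        const match = fs.readdirSync(dir).find((name) => name.endsWith(suffix))
        if (match) return path.join(dir, match)
    }
    return undefined
}
```

The file contents found this way would then be handed to the same enrichment step (`archieToEnriched`) that the Google Docs path uses.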

loadGdocFromGdocBase routes transparently: when a caller passes
contentSource=Gdocs and the row's storage source is 'file', the file path
is taken. Callers (e.g. the admin client) don't need to discover the source
ahead of time. Storage and fetch concerns are expressed via a new
GdocStorageSource literal subset of GdocsContentSource.
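
The routing rule can be sketched as a small pure function. The type shapes below are stand-ins: the PR names `GdocsContentSource` and `GdocStorageSource` but does not show their definitions, and `"internal"` is an assumed extra member representing "use what is already stored". `resolveFetchPath` is a hypothetical name.

```typescript
// Assumed shapes, for illustration only.
type GdocStorageSource = "gdocs" | "file"
type ContentSource = "internal" | GdocStorageSource

// Hypothetical routing helper: decide which fetch path to take for a row.
function resolveFetchPath(
    requested: ContentSource,
    rowSource: GdocStorageSource
): ContentSource {
    // A caller asking for a Google Docs fetch is redirected to the file
    // when the row's storage source is 'file'; it never needs to know.
    if (requested === "gdocs" && rowSource === "file") return "file"
    // Every other combination is passed through unchanged.
    return requested
}
```

Keeping the redirect in one place is what lets callers such as the admin client keep passing the same content source they always did.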

All seven site/gdocs/pages/*.tsx Omit types exclude 'source' along with the
other backend-only fields (contentMd5, markdown, publicationContext,
revisionId). gdocsDeploy's lightning-prop config flags 'source' as
rendering-safe, so changes to it don't force a full site rebake.
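
As a rough illustration of the rendering-safe idea, the sketch below treats a change as needing a full rebake only when some changed prop is not flagged safe. The config shape and the names `changedKeys` and `needsFullRebake` are assumptions; the real lightning-prop config in gdocsDeploy may be structured differently.

```typescript
// Assumed config shape: true means a change to that prop is rendering-safe.
type PropConfig = Record<string, boolean>
type Row = Record<string, unknown>

// List the props whose serialized values differ between two versions.
function changedKeys(prev: Row, next: Row): string[] {
    return Object.keys({ ...prev, ...next }).filter(
        (key) => JSON.stringify(prev[key]) !== JSON.stringify(next[key])
    )
}

// Hypothetical check: props missing from the config are treated as unsafe,
// so an unknown change falls back to a full rebake.
function needsFullRebake(prev: Row, next: Row, config: PropConfig): boolean {
    return changedKeys(prev, next).some((key) => config[key] !== true)
}
```

Under this sketch, flipping only `source` from `'gdocs'` to `'file'` would not trigger a full rebake as long as `source` is flagged safe.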

The existing Google Docs path is unchanged for rows with source='gdocs' —
no behavioural difference, just a new routing branch for files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two thin scripts that bracket the file-first lifecycle:

- devTools/gdocs/fetch-all-to-files.ts: fetches every posts_gdocs row from
  Google Docs, converts the AST to ArchieML via gdocToArchie, and writes it
  under <CONTENT_REPO_PATH>/<type>/<slug>--<shortId>.md. On success it also
  flips the row's source to 'file' — exporting IS the migration. Supports
  --dry-run, --type, --id, --concurrency filters.

- devTools/gdocs/create-content.ts: scaffolds a new piece of content
  (article | data-insight | topic-page | linear-topic-page | fragment).
  Generates a uuidv7 id, writes a minimal valid ArchieML scaffold, inserts
  a posts_gdocs row with source='file', and prints the admin preview URL.
  Replaces the admin's "Add a document" flow when there's no Google Doc to
  back the new piece.

Together these cover the two on-ramps for the file-first workflow:
migrating existing Gdoc content across, and creating new pieces from
scratch. Day-to-day editing then happens in the file directly; the admin
re-reads on every preview refresh via GdocBase.fetchFromFile.
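
The export layout described above can be sketched as a path builder. This assumes the short id is the first 12 characters of the gdoc id, as stated in the first commit; `contentFilePath` is a hypothetical name and not necessarily how the scripts compute the path.

```typescript
import * as path from "node:path"

// Hypothetical builder for <CONTENT_REPO_PATH>/<type>/<slug>--<shortId>.md
function contentFilePath(
    contentRepoPath: string,
    type: string,
    slug: string,
    gdocId: string
): string {
    // Assumption: shortId is the first 12 characters of the gdoc id.
    const shortId = gdocId.slice(0, 12)
    return path.join(contentRepoPath, type, `${slug}--${shortId}.md`)
}
```

Because the short id is embedded in the file name, a file can be renamed when its slug changes and still be matched back to its posts_gdocs row.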

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two affordances on the gdoc preview page so authors/devs can tell at a
glance where content is being read from, and jump straight into editing
it:

- A coloured "File" / "Google Docs" tag next to the title, with a tooltip
  explaining what each source implies for the refresh loop.
- The existing "[ Edit ]" link becomes source-aware:
  - source='gdocs' → unchanged (links to the Google Doc edit URL).
  - source='file'  → replaced by two links: "Edit in Claude" (opens a
    Claude Code session via claude://code/new scoped to the content repo,
    with a prompt that nudges the agent toward the owid-content skill and
    the component registry), plus the relative file path as click-to-copy
    fallback.

CONTENT_REPO_PATH is newly exposed via clientSettings so the admin client
can build the absolute file path the claude:// deep-link expects. When the
env var isn't set, the "Edit in Claude" button is suppressed and only the
relative-path copy button remains.

No behaviour change for Google Docs-backed gdocs: the "Edit" link on those
rows still points at docs.google.com as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit rendered two action links for file-backed gdocs
(Edit in Claude + the relative file path as a copy button). The path
was long enough that the header wrapped to two lines and pushed the
Draft tag off the first row.

Collapse to a single "Edit in Claude" link, matching the visual weight
of the "Edit" link used for Google Docs-backed rows. The file path
moves into the tooltip — still visible on hover, no longer taking
horizontal space.

A copy-path fallback is preserved for sessions where CONTENT_REPO_PATH
isn't configured on the client (no claude:// URL can be built).
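
The resulting decision can be sketched as a function that returns which edit affordance to render. All names here (`EditAffordance`, `editAffordance`) are hypothetical, and the sketch deliberately stops short of building the claude:// URL, since its parameters are not given in this description.

```typescript
// Hypothetical description of what the preview header should render.
type EditAffordance =
    | { kind: "gdocs-edit" }
    | { kind: "claude-link"; absPath: string; tooltip: string }
    | { kind: "copy-path"; relPath: string }

function editAffordance(
    source: "gdocs" | "file",
    relPath: string,
    contentRepoPath?: string
): EditAffordance {
    // Google Docs-backed rows keep the existing "Edit" link.
    if (source === "gdocs") return { kind: "gdocs-edit" }
    // Without CONTENT_REPO_PATH no absolute path (and so no deep link) can
    // be built, so fall back to copying the relative path.
    if (!contentRepoPath) return { kind: "copy-path", relPath }
    // Strip trailing slashes so the join does not produce "//".
    const absPath = `${contentRepoPath.replace(/\/+$/, "")}/${relPath}`
    // The path lives in the tooltip rather than in the header itself.
    return { kind: "claude-link", absPath, tooltip: relPath }
}
```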

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For authors who prefer VS Code to Claude Code, the file-backed edit
area now renders two pipe-separated links inside the existing bracket
decoration: [ Edit in Claude ✏️ | VS Code ]. The VS Code link uses the
well-known vscode://file/<abs-path> deep-link scheme, built from the
same absPath already computed for the claude:// URL.
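
A minimal sketch of the VS Code link, taking the `vscode://file/<abs-path>` template above literally. `vsCodeLink` is a hypothetical name, and both the percent-encoding and the handling of the path's leading slash are assumptions; the PR may normalise these differently.

```typescript
// Hypothetical builder: append the absolute path to the vscode://file/ prefix.
// encodeURI keeps "/" intact but escapes characters such as spaces.
function vsCodeLink(absPath: string): string {
    // Assumption: the leading "/" of a POSIX path is kept as-is, which
    // yields "vscode://file//..." for such paths.
    return `vscode://file/${encodeURI(absPath)}`
}
```

Since the link is derived purely from `absPath`, it is available in exactly the same cases as the Claude link, namely when CONTENT_REPO_PATH is exposed to the client.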

Both links share the path-in-tooltip convention so the header stays on one
line (header overflow was the regression fixed by the 💄 change in the
previous commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

mlbrgl commented May 27, 2026

Member Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

owidbot commented May 27, 2026

Contributor

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs Docs Preview

Login: ssh owid@staging-site-claude-cms-file

Archive:

🎨 Bespoke dev server

SVG tester:

Number of differences (graphers): 0 ✅
Number of differences (grapher views): skipped
Number of differences (mdims): skipped
Number of differences (explorers): skipped
Number of differences (thumbnails): skipped

Edited: 2026-05-27 20:29:41 UTC
Execution time: 1.52 seconds

@github-actions


This PR has had no activity within the last two weeks. It is considered stale and will be closed in 3 days if no further activity is detected.

@github-actions github-actions Bot added the stale label Jun 11, 2026
@mlbrgl mlbrgl changed the base branch from claude-cms-base to graphite-base/6545 June 12, 2026 19:12
@github-actions github-actions Bot removed the stale label Jun 13, 2026