audits: UX audit of Sphinx docs site (2026-05, audit-only) by igerber · Pull Request #429 · igerber/diff-diff

igerber · 2026-05-14T10:53:12Z

Summary

Audit-only deliverable evaluating diff-diff.readthedocs.io against a 9-category UX rubric on 12 Tier-1 pages (desktop + mobile) and comparing the current pydata-sphinx-theme against furo and sphinx-book-theme on a 5-page subset.

Full deliverable: audits/ux-2026-05/README.md. Methodology + per-category checks: audits/ux-2026-05/rubric.md. Per-page perf + console data: audits/ux-2026-05/perf/desktop/. Console-error roundup: audits/ux-2026-05/console-errors.md.

Top recommendations (full P0/P1/P2 list in the README):

Theme: stay on pydata-sphinx-theme; bump version floor to >=0.16.1 for the v0.16+ accessibility improvements (P1).
Mobile sidebar drawer flattens the toctree (drops desktop section headings); largest single mobile finding (P1).
Tutorials missing prev/next links at footer (P1).
4 methodology .md files in docs/methodology/ are not Sphinx-rendered (no myst-parser); orphaned from human-visitor navigation (P1).

Scope discipline (verification details in README sections 7-10):

Zero changes to docs/, pyproject.toml, diff_diff/, rust/, tests/, .readthedocs.yaml, .gitignore, .github/. Only audits/ux-2026-05/** paths in this PR.
0 em-dashes in audit prose.
.playwright-mcp/ runtime artifacts excluded (not staged).

Methodology references

N/A - no methodology / math / estimator changes. This PR adds an audit deliverable only.

Validation

No test changes.
No source code changes.
Audit-doc PR triggers ai_pr_review.yml on open (no path filter; expected) and ci-gate.yml (will block until ready-for-ci label is added; user-driven). docs-tests.yml, notebooks.yml, rust-test.yml do NOT fire on audits/** paths.

Security / privacy

Confirm no secrets/PII in this PR: Yes. The screenshots are of the publicly deployed https://diff-diff.readthedocs.io/ site; no auth tokens, credentials, or non-public content.

Per the audit-only scope, all P0/P1/P2 implementation work goes into separate follow-up PRs after the recommendations are triaged.

Audit-only deliverable evaluating diff-diff.readthedocs.io against a 9-category UX rubric on 12 Tier-1 pages (desktop + mobile) and comparing the current pydata-sphinx-theme against furo and sphinx-book-theme on a 5-page subset.

Key findings (full doc at audits/ux-2026-05/README.md):

- Recommend staying on pydata-sphinx-theme; bump version floor to >=0.16.1 for the v0.16+ accessibility improvements (P1)
- Highest-impact UX gap: mobile sidebar drawer flattens the toctree - drops section headings present on desktop sidebar (P1)
- Tutorials lack prev/next links at the bottom; users have to scroll back up to find the next tutorial (P1)
- 4 methodology .md files in docs/methodology/ are not Sphinx-rendered (no myst-parser); orphaned from human-visitor navigation (P1)
- Performance is already strong: FCP <400ms on 11/12 pages, HTML decoded sizes 38-177 KB, 24-27 resources per page
- AI-agent surface (4 llms.txt variants + Schema.org JSON-LD + sitemap) is exemplary

No source/conf/CSS changes in this PR; recommendations are queued as separate follow-up PRs after triage.

Captured via Microsoft Playwright MCP server. ~50 screenshots stored as WebP (most) + JPG (long pages exceeding WebP's 16383px height limit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-14T10:57:30Z

Overall assessment

✅ Looks good

This PR does not change any estimator, weighting, variance/SE path, identification assumption, or default behavior. I cross-checked the repo state against docs/methodology/REGISTRY.md; the only reviewable risk here is whether the new audit artifact is accurate and safe for follow-up work.

Executive summary

The methodology surface is unchanged, but the audit text currently mislabels the non-rendered methodology markdown as “orphaned” and briefly suggests deleting it, even though those files are tracked elsewhere in the repo as active methodology dependencies.
Two evidence summaries overstate what the raw captures show: the EthicalAds ERR_ABORTED issue is described as happening on every page, and the timing JSONs are described as uniform even though 01-index-timing.json uses a different schema and omits resourceCount.
The rest of the audit mostly lines up with the current repo: pydata-sphinx-theme>=0.15 is still the floor, myst-parser is absent, and the docs/methodology/*.md files do exist without Sphinx rendering.
I found one minor artifact hygiene issue: an extra duplicate detail screenshot is committed but not referenced by the report.

Methodology

P2 audits/ux-2026-05/README.md:L67-L70: The IA section says the methodology markdown files are “orphaned” and proposes “add myst-parser extension OR delete the orphaned methodology .md files.” That deletion framing is misleading: these files are still first-class methodology sources in the repo, e.g. docs/doc-deps.yaml:L334-L343 and docs/doc-deps.yaml:L695-L703 declare docs/methodology/continuous-did.md and docs/methodology/survey-theory.md as methodology dependencies, and diff_diff/guides/llms-practitioner.txt:L501-L505 points readers to docs/methodology/REPORTING.md. Impact: a follow-up PR could treat deletion as an endorsed cleanup path and remove source material the project relies on for methodology/documentation alignment. Concrete fix: reword this to “not Sphinx-rendered for human visitors” and keep the allowed follow-ups to “render” or “relocate with updated references,” not delete.
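For the "render" follow-up, a minimal sketch of the Sphinx configuration change might look like the fragment below. It assumes the config lives at docs/conf.py and that myst-parser has been added to the docs dependencies; the project's existing extension list is not shown in this PR, so a placeholder comment stands in for it. The methodology files would still need to be listed in a toctree before they show up in navigation.

```python
# docs/conf.py (fragment, assumed location)
# Sketch only: the real extension list is not part of this PR.
extensions = [
    # ... keep the project's existing extensions here ...
    "myst_parser",  # lets Sphinx parse the .md files under docs/methodology/
]
```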

Code Quality

No findings. This PR adds audit artifacts only; no executable code paths changed.

Performance

P2 audits/ux-2026-05/README.md:L197-L198 and audits/ux-2026-05/console-errors.md:L11-L11, L34-L36: Both summaries say the EthicalAds net::ERR_ABORTED requests occur on every page / two per page, but the raw captures for at least audits/ux-2026-05/perf/desktop/02-quickstart-network.txt:L1-L26, 10-api-autosummary-CallawaySantAnna-network.txt:L1-L25, and 11-references-network.txt:L1-L26 contain no failed requests at all. Impact: the audit overstates the consistency of the RTD/EthicalAds issue, which weakens trust in the evidence table and could mis-prioritize follow-up work. Concrete fix: revise the summary to match the raw captures exactly, e.g. “observed on 9 of 12 pages,” or regenerate the affected runs if the intent was to normalize capture conditions.
P2 audits/ux-2026-05/README.md:L26-L26 and L174-L189 vs. audits/ux-2026-05/perf/desktop/01-index-timing.json:L1-L9: The README describes a uniform per-page timing JSON containing transfer size and resource count, but 01-index-timing.json uses a different key set (transferSize_KB, decodedBodySize_KB, FP_ms) and omits resourceCount; the table then inserts (24 baseline) for index resources even though that value is not present in the raw JSON. Impact: any downstream analysis has to special-case the first page, and the report currently mixes recorded metrics with inferred ones without saying so. Concrete fix: regenerate or normalize 01-index-timing.json to the same schema as the other 11 files, or explicitly annotate the index row as manually reconstructed from the network log.
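Both evidence checks above could be scripted so the summaries are derived from the raw captures instead of written by hand. The sketch below is illustrative, not part of the PR: the function names are hypothetical, it assumes failed requests appear in the `*-network.txt` captures as lines containing the literal text `net::ERR_ABORTED`, and it assumes each `*-timing.json` file holds a single top-level JSON object.

```python
import json
from collections import Counter
from pathlib import Path


def pages_with_marker(perf_dir, marker="net::ERR_ABORTED"):
    """Map each *-network.txt file name to the number of lines containing marker."""
    hits = {}
    for path in sorted(Path(perf_dir).glob("*-network.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        hits[path.name] = sum(marker in line for line in text.splitlines())
    return hits


def timing_schema_outliers(perf_dir):
    """Return {file name: (missing_keys, extra_keys)} for every *-timing.json
    whose top-level keys differ from the most common key set."""
    key_sets = {}
    for path in sorted(Path(perf_dir).glob("*-timing.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # Assumption: a timing file is one JSON object; anything else is
        # treated as having no keys so it is flagged as an outlier.
        key_sets[path.name] = frozenset(data) if isinstance(data, dict) else frozenset()
    if not key_sets:
        return {}
    majority, _ = Counter(key_sets.values()).most_common(1)[0]
    return {
        name: (sorted(majority - keys), sorted(keys - majority))
        for name, keys in key_sets.items()
        if keys != majority
    }
```

Run against audits/ux-2026-05/perf/desktop/, the first function gives the exact "N of 12 pages" figure for the EthicalAds summary, and the second lists any timing file (such as 01-index-timing.json) whose keys do not match the rest.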

Maintainability

P3 audits/ux-2026-05/README.md:L298-L304: The appendix says there are 6 detail crops, but the PR also adds audits/ux-2026-05/screenshots/current/details/02-mobile-search-keypress.webp, which is byte-identical to 02-mobile-search-modal-empty.webp and is not referenced anywhere in the report. Impact: minor repo bloat/confusion in what is supposed to be a fixed evidence set. Concrete fix: remove the duplicate file or add a sentence explaining why both names are intentionally kept.
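A byte-identical duplicate like this can be caught mechanically by hashing every file in the evidence directory. The following is a minimal sketch under that approach; the function name is hypothetical and nothing like it exists in the PR.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root):
    """Group files under root by SHA-256 digest and return the groups
    (lists of paths relative to root) that contain more than one file."""
    root = Path(root)
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(root).as_posix())
    return [names for names in by_digest.values() if len(names) > 1]
```

Calling `find_duplicate_files("audits/ux-2026-05/screenshots")` before committing would list each set of identical files, so an unreferenced copy can be removed or documented.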

Tech Debt

No new blocker-level tech debt. The only notable debt introduced here is the small evidence-hygiene issue above.

Security

No findings. I did not see secrets, credentials, or non-public content in the added artifacts.

Documentation/Tests

P3 audits/ux-2026-05/console-errors.md:L9-L10 vs. audits/ux-2026-05/perf/desktop/01-index-console.txt:L1-L2 and 02-quickstart-console.txt:L1-L2: The summary says “2-4 console messages per page were emitted at lower severity,” but the raw captures show 4 on the index page and 5 on most others. Impact: minor factual drift in the report text. Concrete fix: update the range to 4-5, or just avoid a range and report the exact per-page counts already present in the raw files.

igerber closed this May 14, 2026

igerber wants to merge 1 commit into main from docs/ux-audit-2026-05
