audits: UX audit of Sphinx docs site (2026-05, audit-only) #429

Closed
igerber wants to merge 1 commit into main from docs/ux-audit-2026-05

igerber (Owner) commented May 14, 2026

Summary

Audit-only deliverable evaluating diff-diff.readthedocs.io against a 9-category UX rubric on 12 Tier-1 pages (desktop + mobile) and comparing the current pydata-sphinx-theme against furo and sphinx-book-theme on a 5-page subset.

Full deliverable: audits/ux-2026-05/README.md. Methodology + per-category checks: audits/ux-2026-05/rubric.md. Per-page perf + console data: audits/ux-2026-05/perf/desktop/. Console-error roundup: audits/ux-2026-05/console-errors.md.

Top recommendations (full P0/P1/P2 list in the README):

  • Theme: stay on pydata-sphinx-theme; bump version floor to >=0.16.1 for the v0.16+ accessibility improvements (P1).
  • Mobile sidebar drawer flattens the toctree (drops desktop section headings); largest single mobile finding (P1).
  • Tutorials missing prev/next links at footer (P1).
  • 4 methodology .md files in docs/methodology/ are not Sphinx-rendered (no myst-parser); orphaned from human-visitor navigation (P1).
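
One possible shape for the follow-up that renders the methodology .md files is sketched below. It is a hypothetical excerpt for a Sphinx conf.py, not the project's actual configuration: the existing extension list is not shown in this PR, and the explicit `source_suffix` mapping is included only for clarity.

```python
# Hypothetical conf.py excerpt (assumption: the real docs/conf.py differs).
extensions = [
    # ... the project's existing extensions would stay here ...
    "myst_parser",  # provided by the myst-parser package; parses Markdown sources
]

# Let reStructuredText and Markdown sources coexist in one build.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}

# The audit recommends keeping the current theme.
html_theme = "pydata_sphinx_theme"
```

The rendered pages would still need a toctree entry to be reachable from the sidebar, and the version floor bump (pydata-sphinx-theme>=0.16.1) belongs in the dependency metadata rather than in conf.py.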

Scope discipline (verification details in README sections 7-10):

  • Zero changes to docs/, pyproject.toml, diff_diff/, rust/, tests/, .readthedocs.yaml, .gitignore, .github/. Only audits/ux-2026-05/** paths in this PR.
  • 0 em-dashes in audit prose.
  • .playwright-mcp/ runtime artifacts excluded (not staged).
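
The scope claims above can be spot-checked mechanically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the list of changed paths has already been obtained (for example from the diff against main) as POSIX-style strings.

```python
from pathlib import PurePosixPath

# Assumption: every file in this PR should live under this directory.
ALLOWED_PREFIX = PurePosixPath("audits/ux-2026-05")
EM_DASH = "\u2014"


def out_of_scope(changed_paths):
    """Return the changed paths that are not inside the audit directory."""
    return [
        p for p in changed_paths
        if ALLOWED_PREFIX not in PurePosixPath(p).parents
    ]


def count_em_dashes(text):
    """Count em-dash characters in a piece of audit prose."""
    return text.count(EM_DASH)
```

An empty list from `out_of_scope` and a zero from `count_em_dashes` for each audit file would correspond to the first two bullets.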

Methodology references

N/A - no methodology / math / estimator changes. This PR adds an audit deliverable only.

Validation

  • No test changes.
  • No source code changes.
  • This audit-doc PR triggers ai_pr_review.yml on open (no path filter; expected) and ci-gate.yml (blocks until the ready-for-ci label is added; user-driven). docs-tests.yml, notebooks.yml, and rust-test.yml do NOT fire on audits/** paths.

Security / privacy

Confirm no secrets/PII in this PR: Yes. The screenshots are of the publicly deployed https://diff-diff.readthedocs.io/ site; no auth tokens, credentials, or non-public content.


Per the audit-only scope, all P0/P1/P2 implementation work goes into separate follow-up PRs after the recommendations are triaged.

Audit-only deliverable evaluating diff-diff.readthedocs.io against a
9-category UX rubric on 12 Tier-1 pages (desktop + mobile) and comparing
the current pydata-sphinx-theme against furo and sphinx-book-theme on a
5-page subset.

Key findings (full doc at audits/ux-2026-05/README.md):
- Recommend staying on pydata-sphinx-theme; bump version floor to
  >=0.16.1 for the v0.16+ accessibility improvements (P1)
- Highest-impact UX gap: mobile sidebar drawer flattens the toctree -
  drops section headings present on desktop sidebar (P1)
- Tutorials lack prev/next links at the bottom; users have to scroll
  back up to find the next tutorial (P1)
- 4 methodology .md files in docs/methodology/ are not Sphinx-rendered
  (no myst-parser); orphaned from human-visitor navigation (P1)
- Performance is already strong: FCP <400ms on 11/12 pages, HTML
  decoded sizes 38-177 KB, 24-27 resources per page
- AI-agent surface (4 llms.txt variants + Schema.org JSON-LD + sitemap)
  is exemplary

No source/conf/CSS changes in this PR; recommendations are queued as
separate follow-up PRs after triage.

Captured via Microsoft Playwright MCP server. ~50 screenshots stored as
WebP (most) + JPG (long pages exceeding WebP's 16383px height limit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Overall assessment

✅ Looks good

This PR does not change any estimator, weighting, variance/SE path, identification assumption, or default behavior. I cross-checked the repo state against docs/methodology/REGISTRY.md; the only reviewable risk here is whether the new audit artifact is accurate and safe for follow-up work.

Executive summary

  • The methodology surface is unchanged, but the audit text currently mislabels the non-rendered methodology markdown as “orphaned” and briefly suggests deleting it, even though those files are tracked elsewhere in the repo as active methodology dependencies.
  • Two evidence summaries overstate what the raw captures show: the EthicalAds ERR_ABORTED issue is described as happening on every page, and the timing JSONs are described as uniform even though 01-index-timing.json uses a different schema and omits resourceCount.
  • The rest of the audit mostly lines up with the current repo: pydata-sphinx-theme>=0.15 is still the floor, myst-parser is absent, and the docs/methodology/*.md files do exist without Sphinx rendering.
  • I found one minor artifact hygiene issue: an extra duplicate detail screenshot is committed but not referenced by the report.

Methodology

  • P2 audits/ux-2026-05/README.md:L67-L70: The IA section says the methodology markdown files are “orphaned” and proposes “add myst-parser extension OR delete the orphaned methodology .md files.” That deletion framing is misleading: these files are still first-class methodology sources in the repo, e.g. docs/doc-deps.yaml:L334-L343 and docs/doc-deps.yaml:L695-L703 declare docs/methodology/continuous-did.md and docs/methodology/survey-theory.md as methodology dependencies, and diff_diff/guides/llms-practitioner.txt:L501-L505 points readers to docs/methodology/REPORTING.md. Impact: a follow-up PR could treat deletion as an endorsed cleanup path and remove source material the project relies on for methodology/documentation alignment. Concrete fix: reword this to “not Sphinx-rendered for human visitors” and keep the allowed follow-ups to “render” or “relocate with updated references,” not delete.

Code Quality

  • No findings. This PR adds audit artifacts only; no executable code paths changed.

Performance

  • P2 audits/ux-2026-05/README.md:L197-L198 and audits/ux-2026-05/console-errors.md:L11-L11, L34-L36: Both summaries say the EthicalAds net::ERR_ABORTED requests occur on every page / two per page, but the raw captures for at least audits/ux-2026-05/perf/desktop/02-quickstart-network.txt:L1-L26, 10-api-autosummary-CallawaySantAnna-network.txt:L1-L25, and 11-references-network.txt:L1-L26 contain no failed requests at all. Impact: the audit overstates the consistency of the RTD/EthicalAds issue, which weakens trust in the evidence table and could mis-prioritize follow-up work. Concrete fix: revise the summary to match the raw captures exactly, e.g. “observed on 9 of 12 pages,” or regenerate the affected runs if the intent was to normalize capture conditions.
  • P2 audits/ux-2026-05/README.md:L26-L26 and L174-L189 vs. audits/ux-2026-05/perf/desktop/01-index-timing.json:L1-L9: The README describes a uniform per-page timing JSON containing transfer size and resource count, but 01-index-timing.json uses a different key set (transferSize_KB, decodedBodySize_KB, FP_ms) and omits resourceCount; the table then inserts (24 baseline) for index resources even though that value is not present in the raw JSON. Impact: any downstream analysis has to special-case the first page, and the report currently mixes recorded metrics with inferred ones without saying so. Concrete fix: regenerate or normalize 01-index-timing.json to the same schema as the other 11 files, or explicitly annotate the index row as manually reconstructed from the network log.
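
Both findings above come down to checking summary text against the raw captures. A minimal sketch of such a check follows, assuming the network logs and timing JSONs have been read into dictionaries keyed by page name; the helper names are hypothetical, and the timing files are assumed to hold a single top-level JSON object.

```python
import json
from collections import Counter


def pages_with_failures(network_logs, marker="net::ERR_ABORTED"):
    """Given {page: log_text}, return the sorted pages whose log contains marker."""
    return sorted(name for name, text in network_logs.items() if marker in text)


def schema_outliers(timing_docs):
    """Given {page: json_text}, report pages whose key set differs from the majority.

    The most common key set is treated as the reference schema. Each outlier
    maps to the keys it lacks ("missing") and the keys only it has ("extra").
    """
    key_sets = {
        name: frozenset(json.loads(text)) for name, text in timing_docs.items()
    }
    reference, _count = Counter(key_sets.values()).most_common(1)[0]
    return {
        name: {
            "missing": sorted(reference - keys),
            "extra": sorted(keys - reference),
        }
        for name, keys in key_sets.items()
        if keys != reference
    }
```

The length of the list returned by `pages_with_failures` gives the "N of 12 pages" figure directly, and any page listed by `schema_outliers` is one whose table row would need either regeneration or an explicit annotation.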

Maintainability

  • P3 audits/ux-2026-05/README.md:L298-L304: The appendix says there are 6 detail crops, but the PR also adds audits/ux-2026-05/screenshots/current/details/02-mobile-search-keypress.webp, which is byte-identical to 02-mobile-search-modal-empty.webp and is not referenced anywhere in the report. Impact: minor repo bloat/confusion in what is supposed to be a fixed evidence set. Concrete fix: remove the duplicate file or add a sentence explaining why both names are intentionally kept.
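
Byte-identical artifacts like the pair above can be found by hashing file contents. The following is a sketch under the assumption that the screenshots sit in a local directory tree; `find_duplicate_files` is a hypothetical helper, not part of the audit tooling.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root):
    """Group files under root by SHA-256 digest and return the duplicate groups.

    Each group is a list of paths relative to root; only groups with more
    than one member are returned.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(root).as_posix())
    return [names for names in by_digest.values() if len(names) > 1]
```

Run against the screenshots directory, a non-empty result would list each set of files that could be collapsed to one name or documented as intentionally kept.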

Tech Debt

  • No new blocker-level tech debt. The only notable debt introduced here is the small evidence-hygiene issue above.

Security

  • No findings. I did not see secrets, credentials, or non-public content in the added artifacts.

Documentation/Tests

  • P3 audits/ux-2026-05/console-errors.md:L9-L10 vs. audits/ux-2026-05/perf/desktop/01-index-console.txt:L1-L2 and 02-quickstart-console.txt:L1-L2: The summary says “2-4 console messages per page were emitted at lower severity,” but the raw captures show 4 on the index page and 5 on most others. Impact: minor factual drift in the report text. Concrete fix: update the range to 4-5, or just avoid a range and report the exact per-page counts already present in the raw files.

@igerber igerber closed this May 14, 2026