feat: add linkable heading anchors to PSF pages #2966
nagasrisai wants to merge 7 commits into python:main
Adds a custom template filter that post-processes rendered page HTML to inject id attributes and pilcrow self-link anchors into h2-h4 headings. Duplicate slugs get a -N suffix to prevent id collisions. Headings that already carry an id are left untouched. Part of python#2349
Loads the new page_tags library and pipes page content through the add_heading_anchors filter so that every h2-h4 in a PSF page (including the board resolutions listing) gets a stable id attribute and a pilcrow anchor link for direct linking. Part of python#2349
10 test cases covering: h2/h3/h4 processing, h1/h5 exclusion, duplicate-slug deduplication, existing-id passthrough, nested HTML stripping, non-heading passthrough, empty string, empty text, and anchor placement inside the heading element.
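The behaviour described in the commit messages above (h2-h4 only, slug-based ids, a -N suffix for duplicates, existing ids left alone) could be sketched roughly as follows. This is a minimal standalone illustration, not the PR's actual code: the regexes, the local `slugify` stand-in for Django's `slugify`, and the `headerlink` class name are all assumptions, and the Django `@register.filter` wiring is omitted.

```python
import html
import re

# Assumed pattern: an h2-h4 open tag, its attributes, inner HTML, matching close tag.
HEADING_RE = re.compile(r"<(h[2-4])\b([^>]*)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
HAS_ID_RE = re.compile(r"(?:^|\s)id\s*=", re.IGNORECASE)


def slugify(text):
    """Simplified stand-in for Django's slugify (an assumption, not the real one)."""
    text = re.sub(r"[^\w\s-]", "", text.lower(), flags=re.ASCII)
    return re.sub(r"[-\s]+", "-", text).strip("-_")


def add_heading_anchors(content):
    seen = {}  # slug -> number of times it has appeared so far

    def replace(match):
        tag, attrs, inner = match.groups()
        if HAS_ID_RE.search(attrs):
            return match.group(0)  # heading already has an id: leave untouched
        # Strip nested tags so only the visible heading text feeds the slug.
        slug = slugify(html.unescape(TAG_RE.sub("", inner)).strip())
        if not slug:
            return match.group(0)  # empty heading text: nothing to link to
        seen[slug] = seen.get(slug, 0) + 1
        if seen[slug] > 1:
            slug = f"{slug}-{seen[slug]}"  # second "notes" becomes "notes-2"
        anchor = f'<a class="headerlink" href="#{slug}">¶</a>'  # hypothetical class
        return f'<{tag} id="{slug}"{attrs}>{inner} {anchor}</{tag}>'

    return HEADING_RE.sub(replace, content)
```

For example, `add_heading_anchors("<h2>Notes</h2><h3>Notes</h3>")` would give the two headings the ids `notes` and `notes-2`, while h1 and h5 elements pass through unchanged.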
@JacobCoffee @ewdurbin @sethmlarson, could one of you review when you get a chance? This adds anchor IDs to headings on PSF pages so sections like the board resolutions can be linked to directly. 10 tests included. Thanks!

The Actions workflows need maintainer approval to run since this is coming from a fork. Could @JacobCoffee or @ewdurbin approve the CI, Lint, and Check collectstatic runs from the Checks tab when you get a chance? Thanks!

@nagasrisai Hi, please don't @ specific people; CODEOWNERS will handle assignment for review.
apps/pages/templatetags/page_tags.py (outdated)

    register = template.Library()

    # Match h2–h4 elements; capture tag name, existing attributes, and inner HTML.
@hugovk h1 is the page title; there's only ever one per page and the page URL already points to it, so an anchor on it wouldn't be useful. The sections people actually want to link into are h2 and below (e.g. individual board resolutions or meeting minutes). This is also the convention most docs sites follow, including GitHub's own markdown renderer.
There are some pages with multiple H1s, such as https://www.python.org/downloads/release/python-31312/.
Does this only run on pages of board meeting resolutions/minutes? Do they all at most have a single H1?
@hugovk Good catch. The filter is only wired up to templates/psf/default.html, not the general pages template, so the Python release pages (like the downloads page you linked) go through a different template and won't be touched by this at all.
For PSF pages specifically, the H1 is always the page title coming from the CMS page.title field, rendered directly in the template as <h1 class="page-title">{{ page.title }}</h1>, separate from page.content where the filter runs. So the content fed into add_heading_anchors shouldn't contain any H1s.
That said, I'm happy to extend the regex to h1–h4 if you'd prefer a more defensive approach. Just let me know.
Yep, tested it locally! Here's what it produces on a few real cases:

- A plain h2 heading gets an id and the pilcrow link injected right inside it.
- Duplicate headings on the same page get deduped cleanly.
- A heading that already has an id is left completely alone, with no double-processing.
- On something that looks like an actual PSF board minutes page, h1 is untouched throughout.

All 10 tests in the test file pass.
I'm confused, because https://www.python.org/psf/records/board/resolutions/ doesn't have any h2-h4 (except one h3 in the sidebar).
@hugovk You're right, and good catch. I looked at the actual page source: the resolutions page content is RST-generated, so the section entries use h1 tags rather than h2-h4, which means the filter wouldn't add pilcrow links there. Those sections do already have IDs on the wrapping divs though, so they're technically linkable, just without a visual anchor. That said, other PSF pages do have h2/h3 content headings; the bylaws page, for instance, has 30+ headings across h2 and h3. That's more the kind of page this would help. So it's a fair question about scope: if the main goal was specifically the resolutions/minutes pages, the filter would need to cover h1 as well. I'm happy to extend it to h1-h4 if that makes more sense, or to narrow the scope to the pages where h2-h4 headings actually appear. Let me know what you think.
How can you tell?
Whether the source is RST or MD or HTML is irrelevant, all can do headings down to h4 and beyond.
Are you sure? I don't see it. Please give an example.
OK, but this issue is specifically about resolutions. And when I test the bylaws page with this PR locally, I don't see any linkable headings.
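One way to settle which heading levels a rendered page actually contains, and how many of them already carry an id, is to count them directly from the HTML. The following is a minimal sketch using only the standard library; the `HeadingCounter` class and `count_headings` function are hypothetical helpers, not part of this PR.

```python
from collections import Counter
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCounter(HTMLParser):
    """Counts heading start tags, and separately those that have an id attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.with_id = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; attrs is a list of (name, value) pairs.
        if tag in HEADING_TAGS:
            self.counts[tag] += 1
            if dict(attrs).get("id"):
                self.with_id[tag] += 1


def count_headings(page_html):
    """Return (headings per level, headings per level that already have an id)."""
    parser = HeadingCounter()
    parser.feed(page_html)
    parser.close()
    return dict(parser.counts), dict(parser.with_id)
```

Because it counts every heading in whatever it is given, it would need to be fed only the page content fragment (rather than the full page with sidebar) to answer the question about content headings.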
Two bugs fixed:
1. The regex only matched h2-h4, so RST-generated pages like the board resolutions page (which use h1 section headings) received no anchors. Extended to h1-h4.
2. Headings that already carry an id attribute (docutils/RST injects these automatically on every section heading) were silently skipped. The filter now reuses the existing id and injects the pilcrow link using it, which is exactly what is needed for RST-sourced pages like the bylaws and resolutions pages.

Also added an idempotency guard so running the filter twice is safe.

Reflects two changes to the filter:
- h1 headings are now processed (not just h2-h4)
- headings with existing ids now get pilcrow links injected

New tests added: RST-generated headings, idempotency guard, h1 processing.
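The revised behaviour in these two commits (h1-h4, reuse of an existing id, and an idempotency guard) might look roughly like the sketch below. As before, this is an illustration under assumptions rather than the PR's code: the regexes, the simplified `slugify`, and the `headerlink` class used as the "already processed" marker are hypothetical.

```python
import html
import re

HEADING_RE = re.compile(r"<(h[1-4])\b([^>]*)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)
ID_VALUE_RE = re.compile(r"""(?:^|\s)id\s*=\s*["']([^"']+)["']""", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
ANCHOR_CLASS = "headerlink"  # hypothetical marker class for injected links


def slugify(text):
    """Simplified stand-in for Django's slugify (an assumption)."""
    text = re.sub(r"[^\w\s-]", "", text.lower(), flags=re.ASCII)
    return re.sub(r"[-\s]+", "-", text).strip("-_")


def add_heading_anchors(content):
    seen = {}

    def replace(match):
        tag, attrs, inner = match.groups()
        if f'class="{ANCHOR_CLASS}"' in inner:
            return match.group(0)  # idempotency guard: link already injected
        existing = ID_VALUE_RE.search(attrs)
        if existing:
            slug = existing.group(1)  # reuse the id the heading already carries
            open_tag = f"<{tag}{attrs}>"
        else:
            slug = slugify(html.unescape(TAG_RE.sub("", inner)).strip())
            if not slug:
                return match.group(0)
            seen[slug] = seen.get(slug, 0) + 1
            if seen[slug] > 1:
                slug = f"{slug}-{seen[slug]}"
            open_tag = f'<{tag} id="{slug}"{attrs}>'
        anchor = f'<a class="{ANCHOR_CLASS}" href="#{slug}">¶</a>'
        return f"{open_tag}{inner} {anchor}</{tag}>"

    return HEADING_RE.sub(replace, content)
```

With this shape, a heading that already has an id keeps it and only gains the pilcrow link, and a second pass over the output leaves it unchanged because the guard sees the injected link.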
@hugovk You're right on both points, and I've now identified and fixed the two actual bugs. The bylaws page not showing anchors locally is the main one. RST content in pythondotorg is rendered by docutils, which automatically adds id attributes to section headings, and the filter was silently skipping any heading that already had an id. The resolutions page was also missed because the regex only covered h2–h4, and docutils generates those sections as h1. Extending to h1–h4 fixes that. Two commits just pushed: one for the filter, one to update the tests (added tests for RST-generated headings with existing ids, idempotency, and h1 processing).
Adds linkable anchor IDs to headings on PSF pages so that individual sections can be shared as direct URLs, which is what #2349 is asking for.
Added an `add_heading_anchors` template filter in `apps/pages/templatetags/page_tags.py`. It processes the rendered page HTML and injects an `id` attribute into each h2, h3, and h4 element based on the slugified heading text, along with a small pilcrow (¶) anchor link. Applied the filter in `templates/psf/default.html` so it covers the board resolutions page and all other PSF pages automatically.
Duplicate heading texts get a `-2`, `-3` suffix to keep IDs unique, and headings that already carry an `id` are left untouched. Ten tests included in `apps/pages/tests/test_templatetags.py`.
Closes #2349