Enhance SEO with llms.txt, sitemap updates, and RSS feed support #21
CourtneyFradreck wants to merge 2 commits into
✅ Deploy Preview for courtnotes ready!
📝 Walkthrough
This pull request extends a static site with RSS feed generation, LLM-accessible content, and enhanced SEO metadata. Changes include new feed templates (…
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Pull request overview
This PR enhances the site’s SEO/discoverability by adding llms.txt and an RSS feed, improving sitemap metadata, and tightening on-page SEO tags for better indexing and previews.
Changes:
- Add `llms.txt` and expose it via `robots.txt`.
- Introduce an RSS feed (`feed.xml`) and add RSS autodiscovery in SEO markup.
- Improve sitemap entries (priorities) and add image dimension/loading hints across key templates.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/sitemap.xml.njk | Adds per-URL <priority> values to sitemap entries. |
| src/robots.txt.njk | Adds a pointer to llms.txt alongside the sitemap reference. |
| src/llms.txt.njk | Adds an llms.txt endpoint with site summary and post links. |
| src/index.njk | Adds homepage description and image performance attributes. |
| src/feed.xml.njk | Adds an RSS 2.0 feed rendering posts. |
| src/blog.njk | Adds image dimension/loading attributes to blog snippets. |
| src/_includes/seo.njk | Adds RSS autodiscovery, robots meta, conditional OG/schema output. |
| src/_includes/individual-blog.njk | Removes duplicate <title> and tweaks post hero image attributes (but adds a duplicate RSS link). |
    <title>{{ metadata.title }}</title>
    <link>{{ metadata.siteUrl }}</link>
    <description>Personal blog by {{ metadata.author }} — covering software engineering, startup building, and learning in public.</description>
    <language>en</language>
    <atom:link href="{{ metadata.siteUrl }}/feed.xml" rel="self" type="application/rss+xml"/>
    {%- for post in collections.post | reverse %}
    <item>
      <title>{{ post.data.title }}</title>
      <link>{{ metadata.siteUrl }}{{ post.url }}</link>
      <guid isPermaLink="true">{{ metadata.siteUrl }}{{ post.url }}</guid>
      <description>{{ post.data.description }}</description>
      <pubDate>{{ post.date.toUTCString() }}</pubDate>
      <author>{{ metadata.author }}</author>
RSS XML output needs escaping: at least one post title contains an ampersand (e.g., “AgriLease & Nexpay”), and unescaped & (or <, >, etc.) will produce invalid XML in <title>/<description>. Please XML-escape these fields (or wrap in CDATA) when rendering the feed items.
Suggested change:
- <title>{{ metadata.title }}</title>
- <link>{{ metadata.siteUrl }}</link>
- <description>Personal blog by {{ metadata.author }} — covering software engineering, startup building, and learning in public.</description>
- <language>en</language>
- <atom:link href="{{ metadata.siteUrl }}/feed.xml" rel="self" type="application/rss+xml"/>
- {%- for post in collections.post | reverse %}
- <item>
- <title>{{ post.data.title }}</title>
- <link>{{ metadata.siteUrl }}{{ post.url }}</link>
- <guid isPermaLink="true">{{ metadata.siteUrl }}{{ post.url }}</guid>
- <description>{{ post.data.description }}</description>
- <pubDate>{{ post.date.toUTCString() }}</pubDate>
- <author>{{ metadata.author }}</author>
+ <title>{{ metadata.title | escape }}</title>
+ <link>{{ metadata.siteUrl | escape }}</link>
+ <description>Personal blog by {{ metadata.author | escape }} — covering software engineering, startup building, and learning in public.</description>
+ <language>en</language>
+ <atom:link href="{{ (metadata.siteUrl ~ '/feed.xml') | escape }}" rel="self" type="application/rss+xml"/>
+ {%- for post in collections.post | reverse %}
+ <item>
+ <title>{{ post.data.title | escape }}</title>
+ <link>{{ (metadata.siteUrl ~ post.url) | escape }}</link>
+ <guid isPermaLink="true">{{ (metadata.siteUrl ~ post.url) | escape }}</guid>
+ <description>{{ post.data.description | escape }}</description>
+ <pubDate>{{ post.date.toUTCString() }}</pubDate>
+ <author>{{ metadata.author | escape }}</author>
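For reference, the escaping this comment asks for can be sketched in plain JavaScript. The helper name `xmlEscape` and the `registerXmlEscape` hook are hypothetical and not part of the PR; the suggestion above relies on Nunjucks' built-in `escape` filter instead.

```javascript
// Hypothetical helper: escapes the five characters that are special in XML.
// "&" is handled first so the entities produced below are not escaped again.
function xmlEscape(value) {
  return String(value ?? "")
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical hook: call it from the function exported by .eleventy.js.
function registerXmlEscape(eleventyConfig) {
  eleventyConfig.addFilter("xmlEscape", xmlEscape);
}

console.log(xmlEscape("AgriLease & Nexpay")); // AgriLease &amp; Nexpay
```

If the project's Nunjucks environment has autoescaping enabled, the output of a custom filter would likely be escaped a second time, so it would probably need to be paired with `safe`, for example `{{ post.data.title | xmlEscape | safe }}`.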
    <guid isPermaLink="true">{{ metadata.siteUrl }}{{ post.url }}</guid>
    <description>{{ post.data.description }}</description>
    <pubDate>{{ post.date.toUTCString() }}</pubDate>
    <author>{{ metadata.author }}</author>
<author> in RSS 2.0 items is intended to be an email address (often in the form email (name)); a plain display name may fail validation in stricter feed readers. Consider removing <author>, adding a real email, or switching to dc:creator (with the proper namespace) for a name-only author field.
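The two options can be sketched as a small helper that picks the element based on whether an email is available. The function name `rssAuthorTag` and its `{ name, email }` input shape are assumptions made for illustration; the example address is a placeholder.

```javascript
// Hypothetical helper: chooses an author element for an RSS 2.0 item.
function rssAuthorTag({ name, email }) {
  // Minimal escaping for text content; attribute quoting is not needed here.
  const esc = (s) =>
    String(s ?? "").replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  if (email) {
    // RSS 2.0 <author> holds an email address, optionally followed by "(name)".
    const suffix = name ? ` (${esc(name)})` : "";
    return `<author>${esc(email)}${suffix}</author>`;
  }
  // Name only: dc:creator, which needs
  // xmlns:dc="http://purl.org/dc/elements/1.1/" declared on the <rss> element.
  return `<dc:creator>${esc(name)}</dc:creator>`;
}

console.log(rssAuthorTag({ name: "Jane Doe", email: "jane@example.com" }));
console.log(rssAuthorTag({ name: "Jane Doe" }));
```

Keeping the choice in one place means the feed template only ever emits one of the two forms, whichever the site's metadata can support.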
    <link rel="shortcut icon" href="../assets/logo.svg" type="image/x-icon">
    <title>{{ title }}</title>
    <link rel="stylesheet" href="/individual.css">
    <link rel="alternate" type="application/rss+xml" title="{{ metadata.title }} RSS Feed" href="{{ '/feed.xml' | absoluteUrl(metadata.siteUrl) }}">
This duplicates the RSS autodiscovery link already emitted by seo.njk (included on line 6). Keeping both results in two identical <link rel="alternate" ...> tags in the page head; remove one (prefer keeping it centralized in seo.njk).
Suggested change:
- <link rel="alternate" type="application/rss+xml" title="{{ metadata.title }} RSS Feed" href="{{ '/feed.xml' | absoluteUrl(metadata.siteUrl) }}">
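A duplicate like this could be caught with a small check over the rendered HTML. The sketch below is a hypothetical helper, not something in the repository, and it uses a simple regular expression rather than a real HTML parser, so it assumes conventionally quoted attributes.

```javascript
// Hypothetical check: counts RSS autodiscovery <link> tags in rendered HTML.
function countRssAlternateLinks(html) {
  const linkTags = html.match(/<link\b[^>]*>/gi) ?? [];
  return linkTags.filter(
    (tag) =>
      /rel=["']alternate["']/i.test(tag) &&
      /type=["']application\/rss\+xml["']/i.test(tag)
  ).length;
}

const head = `
  <link rel="stylesheet" href="/individual.css">
  <link rel="alternate" type="application/rss+xml" href="https://example.com/feed.xml">
  <link rel="alternate" type="application/rss+xml" href="https://example.com/feed.xml">
`;
console.log(countRssAlternateLinks(head)); // 2
```

A count above 1 on a post page would indicate that both the include and the layout are emitting the tag.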
    Allow: /

    Sitemap: {{ metadata.siteUrl }}/sitemap.xml
    Llms-Txt: {{ metadata.siteUrl }}/llms.txt
Llms-Txt: is not a standard robots.txt directive, so most crawlers will ignore it (and some validators may flag it). If the goal is discoverability, consider relying on the well-known /llms.txt path (and/or linking it from HTML) rather than adding a non-standard robots.txt field.
Suggested change:
- Llms-Txt: {{ metadata.siteUrl }}/llms.txt
Actionable comments posted: 4
🧹 Nitpick comments (1)
src/_includes/individual-blog.njk (1)
9-9: Drop the duplicate RSS discovery tag.
seo.njk is already included on Line 6, and it now emits the same RSS <link rel="alternate"> in src/_includes/seo.njk Line 26. Keeping both means each post page advertises the same feed twice.
✂️ Suggested cleanup
- <link rel="alternate" type="application/rss+xml" title="{{ metadata.title }} RSS Feed" href="{{ '/feed.xml' | absoluteUrl(metadata.siteUrl) }}">
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/_includes/individual-blog.njk` at line 9, remove the duplicate RSS discovery tag from the individual blog template: delete the explicit <link rel="alternate" type="application/rss+xml" ...> line in src/_includes/individual-blog.njk (the one that builds "{{ '/feed.xml' | absoluteUrl(metadata.siteUrl) }}") because seo.njk already emits the same tag; keep the seo.njk emission (src/_includes/seo.njk Line 26) and ensure only that single RSS <link rel="alternate"> remains on post pages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/_includes/seo.njk`:
- Around line 54-111: The JSON-LD blocks (BlogPosting, WebSite, WebPage) inject
unescaped template variables (e.g., title, description, author, metadata.title,
metadata.author, seoTitle, fullImage, absolutePageUrl, date) directly into JSON;
create/register a Nunjucks filter (e.g., toJSON) that returns
JSON.stringify(value) and then update every dynamic insertion inside the three
conditional scripts (the isPost BlogPosting block, the Home WebSite block, and
the default WebPage block) to pipe the variables through that filter (e.g., use
title | toJSON, description | toJSON, author | toJSON, metadata.title | toJSON,
etc.), and ensure date values are converted/escaped similarly
(date.toISOString() | toJSON or pre-format before filtering) so all injected
values are valid, escaped JSON strings.
In `@src/feed.xml.njk`:
- Around line 15-19: Escape XML-sensitive characters when rendering RSS item
fields in src/feed.xml.njk: replace raw uses of post.data.title and
post.data.description with XML-escaped equivalents (e.g., helper/filters that
escape &, <, >, " and '), or wrap description in a CDATA block before emitting;
ensure link/guid composed from metadata.siteUrl and post.url remain unchanged
but escaped if they can contain special characters, and leave pubDate as
post.date.toUTCString(). Update the template to call the escape helper (or use
CDATA for description) wherever post.data.title or post.data.description are
rendered.
In `@src/llms.txt.njk`:
- Around line 5-7: Templates are using two different data sources (site.* in
src/llms.txt.njk vs metadata.* in sitemap.xml.njk and robots.txt.njk); pick one
and make all templates consistent—I suggest standardizing on metadata.*: change
src/llms.txt.njk to use metadata.title and metadata.description instead of
site.title/site.description, and update metadata.js to export description (or
sync .eleventy.js siteMetadata.description into metadata.js) so metadata
contains title, author, siteUrl, and description; ensure any code referencing
siteMetadata is updated to reference metadata where applicable.
In `@src/robots.txt.njk`:
- Line 9: Remove the non-standard "Llms-Txt: {{ metadata.siteUrl }}/llms.txt"
directive from src/robots.txt.njk. Do not replace it with a "Sitemap:" line:
that directive is meant to point at sitemap files, and llms.txt is not one.
Instead, ensure the llms.txt file is served at the standard path (/llms.txt)
so crawlers that look for it can find it there.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 75de40cd-217d-40be-b2cb-f3d2757fb47f
📒 Files selected for processing (8)
- src/_includes/individual-blog.njk
- src/_includes/seo.njk
- src/blog.njk
- src/feed.xml.njk
- src/index.njk
- src/llms.txt.njk
- src/robots.txt.njk
- src/sitemap.xml.njk
    {# Schema.org JSON-LD — conditional by page type #}
    {% if isPost %}
    <script type="application/ld+json">
    {
    "@context": "https://schema.org",
    "@type": "Blog",
    "name": "{{ seoTitle }}",
    "url": "{{ absolutePageUrl }}",
    "@type": "BlogPosting",
    "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "{{ absolutePageUrl }}"
    },
    "headline": "{{ title }}",
    "description": "{{ description }}",
    "image": ["{{ fullImage }}"],
    "author": {
    "@type": "Person",
    "name": "{{ author }}"
    },
    "publisher": {
    "@type": "Organization",
    "name": "{{ metadata.title }}",
    "logo": {
    "@type": "ImageObject",
    "url": "{{ '/assets/logo.svg' | absoluteUrl(metadata.siteUrl) }}"
    }
    },
    "datePublished": "{{ date.toISOString() if date else '2024-01-01' }}",
    "dateModified": "{{ date.toISOString() if date else '2024-01-01' }}"
    }
    </script>
    {% elif title == "Home" %}
    <script type="application/ld+json">
    {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "name": "{{ metadata.title }}",
    "url": "{{ metadata.siteUrl }}",
    "description": "{{ description }}",
    "author": {
    "@type": "Person",
    "name": "{{ metadata.author }}"
    }
    }
    </script>
    {% else %}
    <script type="application/ld+json">
    {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "{{ absolutePageUrl }}"
    },
    "headline": "{{ seoTitle | safe }}",
    "description": "{{ description | safe }}",
    "image": ["{{ fullImage }}"],
    "author": {
    "@type": "Person",
    "name": "{{ author or metadata.author }}"
    },
    "publisher": {
    "@type": "Organization",
    "name": "{{ seoTitle }}",
    "logo": {
    "@type": "ImageObject",
    "url": "{{ '/assets/logo.svg' | absoluteUrl(metadata.siteUrl) }}"
    }
    },
    "datePublished": "{{ date or '2024-01-01' }}",
    "dateModified": "{{ date or '2024-01-01' }}"
    }
    </script>
    {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "{{ seoTitle }}",
    "url": "{{ absolutePageUrl }}",
    "description": "{{ description }}",
    "author": {
    "@type": "Person",
    "name": "{{ metadata.author }}"
    }
    }
    </script>
    {% endif %}
🧩 Analysis chain
🏁 Script executed:
    find . -name "seo.njk" -type f
Repository: CourtneyFradreck/CourtNotes
Length of output: 94
🏁 Script executed:
    cat -n ./src/_includes/seo.njk | head -120
Repository: CourtneyFradreck/CourtNotes
Length of output: 4898
🏁 Script executed:
    find . -name ".eleventy.js" -o -name "eleventy.config.js" -o -name "*.config.js" | head -5
Repository: CourtneyFradreck/CourtNotes
Length of output: 85
🏁 Script executed:
    cat -n ./.eleventy.js
Repository: CourtneyFradreck/CourtNotes
Length of output: 2143
🏁 Script executed:
    rg -i "filter|tojson|jsonencode|jsonescape" --type=js . | head -20
Repository: CourtneyFradreck/CourtNotes
Length of output: 415
🏁 Script executed:
    cat -n package.json
Repository: CourtneyFradreck/CourtNotes
Length of output: 493
🏁 Script executed:
    find src -name "*.md" -type f | head -3 | xargs head -30
Repository: CourtneyFradreck/CourtNotes
Length of output: 5280
🏁 Script executed:
    rg -i "nunjucks|njk" package.json
Repository: CourtneyFradreck/CourtNotes
Length of output: 53
🏁 Script executed:
    rg "tojson|toJSON" src/ --type=njk
Repository: CourtneyFradreck/CourtNotes
Length of output: 98
🏁 Script executed:
    rg "tojson|toJSON" src/
Repository: CourtneyFradreck/CourtNotes
Length of output: 53
Apply JSON escaping to all dynamic values in JSON-LD blocks.
These template variables (title, description, author, metadata.title, metadata.author, seoTitle) are interpolated directly into JSON without encoding. Any double quotes, backslashes, or newlines in these values will produce invalid JSON. Register a Nunjucks filter (e.g., toJSON or escape) that applies JSON.stringify() to these values, or wrap them explicitly with proper JSON encoding.
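A minimal sketch of such a filter follows. The names `toJSON` and `registerToJSON` are illustrative and not taken from the repository; the only Eleventy call assumed is `eleventyConfig.addFilter`. The extra replacement of `<` is an added precaution so that a value containing `</script>` cannot terminate the surrounding script element.

```javascript
// Hypothetical filter: serialises a value as a JSON literal for use inside <script>.
function toJSON(value) {
  // JSON.stringify(undefined) returns undefined, so fall back to null.
  const json = JSON.stringify(value === undefined ? null : value);
  // "\u003c" is a valid JSON escape for "<"; parsers read it back as "<".
  return json.replace(/</g, "\\u003c");
}

// Hypothetical hook: call it from the function exported by .eleventy.js.
function registerToJSON(eleventyConfig) {
  eleventyConfig.addFilter("toJSON", toJSON);
}

console.log(toJSON('He said "hi"'));
```

Because the filter emits the surrounding quotes itself, a template would drop its own, for example `"headline": {{ title | toJSON | safe }}`; the trailing `safe` is assumed to be needed wherever Nunjucks autoescaping is on.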
    <title>{{ post.data.title }}</title>
    <link>{{ metadata.siteUrl }}{{ post.url }}</link>
    <guid isPermaLink="true">{{ metadata.siteUrl }}{{ post.url }}</guid>
    <description>{{ post.data.description }}</description>
    <pubDate>{{ post.date.toUTCString() }}</pubDate>
Escape RSS item fields before writing XML.
post.data.title and post.data.description are emitted raw here. A post containing &, <, or > will produce malformed feed.xml, and some readers will reject the feed entirely. Please XML-escape these values, or wrap the description safely in CDATA before rendering.
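The CDATA option has one catch: the sequence `]]>` inside the text would end the section early. A sketch of a wrapper that handles this is shown below; the helper name `cdata` is hypothetical and not part of the PR.

```javascript
// Hypothetical helper: wraps text in a CDATA section for an RSS <description>.
function cdata(text) {
  // Split any "]]>" across two adjacent sections: the first ends after "]]",
  // the second starts with ">", so a parser reassembles the original text.
  const body = String(text ?? "").replace(/\]\]>/g, "]]]]><![CDATA[>");
  return `<![CDATA[${body}]]>`;
}

console.log(cdata("Tom & Jerry <3")); // <![CDATA[Tom & Jerry <3]]>
```

As with any filter that returns markup, the result would presumably need `| safe` in a Nunjucks template where autoescaping is enabled, otherwise the `<![CDATA[` delimiter itself would be escaped.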
    # {{ site.title }}

    > {{ site.description }}
🧩 Analysis chain
🏁 Script executed:
    #!/bin/bash
    # Description: Compare site metadata sources to detect inconsistencies
    echo "=== Checking .eleventy.js siteMetadata ==="
    rg -A 5 'const siteMetadata = \{' .eleventy.js
    echo -e "\n=== Checking src/_data/metadata.js ==="
    cat src/_data/metadata.js
    echo -e "\n=== Checking other templates' data source usage ==="
    rg -n 'metadata\.(siteUrl|title|author)' --type=njk src/
    rg -n 'site\.(siteUrl|title|author|description)' --type=njk src/
Repository: CourtneyFradreck/CourtNotes
Length of output: 578
🏁 Script executed:
    #!/bin/bash
    # Find all .njk template files mentioned in the review
    echo "=== Locating .njk template files ==="
    fd -e njk src/
    echo -e "\n=== Content of llms.txt.njk ==="
    cat src/llms.txt.njk
    echo -e "\n=== Content of sitemap.xml.njk ==="
    cat src/sitemap.xml.njk
    echo -e "\n=== Content of robots.txt.njk ==="
    cat src/robots.txt.njk
Repository: CourtneyFradreck/CourtNotes
Length of output: 2510
🏁 Script executed:
    #!/bin/bash
    # Check how siteMetadata is exported/registered in .eleventy.js
    echo "=== Checking .eleventy.js for how siteMetadata is made available ==="
    cat .eleventy.js
Repository: CourtneyFradreck/CourtNotes
Length of output: 1855
Standardize template data source usage.
This template uses site.* variables, but sitemap.xml.njk and robots.txt.njk use metadata.* instead. The sources diverge—metadata.js exports only title, author, and siteUrl, while .eleventy.js defines siteMetadata with an additional description field. This inconsistency creates maintenance risk and potential for bugs if the values diverge further.
Standardize on a single data source across all templates.
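One way to picture the single source is a metadata object that carries all four fields, plus a guard that flags any that are missing. Everything below is a sketch: the values are placeholders rather than the site's real metadata, and `missingMetadataKeys` is a hypothetical helper. In the real `src/_data/metadata.js` the object would be exported rather than declared inline.

```javascript
// Placeholder values only; the real ones live in the repository.
const metadata = {
  title: "Example Site",
  author: "Example Author",
  siteUrl: "https://example.com",
  description: "Placeholder description used by every template.",
};

// Hypothetical guard: lists required keys that are absent or empty strings.
function missingMetadataKeys(
  data,
  required = ["title", "author", "siteUrl", "description"]
) {
  return required.filter(
    (key) => typeof data[key] !== "string" || data[key].trim() === ""
  );
}

console.log(missingMetadataKeys(metadata)); // []
```

Running a check like this at build time would surface a template reading `metadata.description` before the field has been added.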
    Allow: /

    Sitemap: {{ metadata.siteUrl }}/sitemap.xml
    Llms-Txt: {{ metadata.siteUrl }}/llms.txt
🧩 Analysis chain
🌐 Web query:
Is "Llms-Txt" a recognized directive in robots.txt for advertising llms.txt files?
💡 Result:
No, "Llms-Txt" is not a recognized directive in robots.txt for advertising llms.txt files. The robots.txt standard, as defined by robotstxt.org, the original 1994 specification, RFC 9309, and Google's implementation, only recognizes specific directives: User-agent, Disallow, Allow, and Sitemap. No source mentions "Llms-Txt" as a valid directive. llms.txt is a separate proposed standard (from llmstxt.org, 2024) for AI guidance, distinct from robots.txt. Some non-standard practices suggest adding Allow: /llms.txt or informal lines like "LLMS: https://example.com/llms.txt" in robots.txt to hint at the file, but these are not part of the official specification and crawlers are not required to recognize or act on them.
Citations:
- 1: https://www.robotstxt.org/robotstxt
- 2: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec
- 3: https://ftp.registro.br/rfc/rfc9309.txt
- 4: https://www.robotstxt.org/orig.html
- 5: https://tools.ietf.org/html/rfc9309
- 6: https://llmstxt.org/
- 7: https://www.reddit.com/r/TechSEO/comments/1nlhvqa/can_robotstxt_be_used_to_allow_ai_crawling_of/
Remove the non-standard "Llms-Txt" directive from robots.txt.
"Llms-Txt" is not a recognized directive in the robots.txt specification (RFC 9309). Standard robots.txt parsers only recognize User-agent, Disallow, Allow, and Sitemap. This directive will be ignored by all standard crawlers.
llms.txt is a separate proposal (from llmstxt.org, 2024), distinct from robots.txt. The Sitemap directive is intended for sitemap files, so it is not a suitable way to advertise llms.txt; instead, ensure the file is accessible at the standard path (/llms.txt), where AI crawlers that look for it can discover it.
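A check along these lines could flag such lines before they ship. This is a sketch under assumptions: the allow-list contains only the four field names listed in this comment, and `findUnknownRobotsFields` is a hypothetical helper rather than part of any robots.txt library.

```javascript
// Assumed allow-list: the four field names this review lists as recognised.
const KNOWN_FIELDS = new Set(["user-agent", "allow", "disallow", "sitemap"]);

// Hypothetical lint: returns lower-cased field names that are not on the allow-list.
function findUnknownRobotsFields(robotsTxt) {
  const unknown = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    if (!line) continue;
    const colon = line.indexOf(":");
    if (colon === -1) continue; // not a "field: value" record
    const field = line.slice(0, colon).trim().toLowerCase();
    if (!KNOWN_FIELDS.has(field)) unknown.push(field);
  }
  return unknown;
}

const sample = [
  "User-agent: *",
  "Allow: /",
  "",
  "Sitemap: https://example.com/sitemap.xml",
  "Llms-Txt: https://example.com/llms.txt",
].join("\n");
console.log(findUnknownRobotsFields(sample)); // [ 'llms-txt' ]
```

Sites that deliberately use other widely seen but non-standard fields would need to extend the allow-list.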