Copilot AI commented Nov 16, 2025

Implement JSON-LD Schema.org markup for BreadcrumbList and TechArticle across the FalkorDB docs site.

Plan:

  • Explore repository structure and understand Jekyll setup
  • Create JSON-LD schema files in _includes directory
    • Create breadcrumb_schema.html for BreadcrumbList markup
    • Create techarticle_schema.html for TechArticle markup
  • Integrate schemas into head_custom.html
  • Test the implementation by validating JSON-LD syntax
  • Verify JSON-LD markup is properly formatted and valid
  • Address code review feedback
    • Add separate datePublished field to TechArticle schema

Implementation Summary:

Files Created/Modified:

  1. _includes/breadcrumb_schema.html - BreadcrumbList JSON-LD schema

    • Dynamically generates breadcrumb navigation from page hierarchy
    • Includes Home → Parent (if exists) → Current Page
    • Only renders on non-homepage pages
    • Uses Jekyll liquid templating for dynamic content
  2. _includes/techarticle_schema.html - TechArticle JSON-LD schema (Updated)

    • Marks all pages as technical documentation articles
    • Includes headline, description, URL, author, publisher
    • Now includes both datePublished and dateModified fields
    • Uses page frontmatter (title, description, date) for content
    • Includes FalkorDB organization info and logo
  3. _includes/head_custom.html - Updated to include both schemas

    • Added includes for both schema files
    • Schemas are automatically injected into all page headers

Latest Changes:

  • Added datePublished field to TechArticle schema (from page.date)
  • Kept dateModified field (from page.last_modified_at with fallback to page.date)
  • Both fields are now included separately, as the requirements for the TechArticle schema specify

Validation:

  • ✓ JSON syntax validated with Python JSON parser
  • ✓ Schema structure follows Schema.org specifications
  • ✓ BreadcrumbList schema tested with sample 3-level navigation
  • ✓ TechArticle schema now includes both publication and modification dates
  • ✓ No security vulnerabilities (CodeQL check passed)
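
The JSON syntax check can be reproduced locally with a short script. This is a minimal sketch, assuming the rendered page HTML is available as a string; the names `JsonLdExtractor`, `extract_json_ld` and `SAMPLE_PAGE` are hypothetical and only the Python standard library is used.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append to the open block.
        if self._in_json_ld:
            self.blocks[-1] += data


def extract_json_ld(html):
    """Parse each JSON-LD block; json.loads raises ValueError on invalid JSON."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [json.loads(block) for block in parser.blocks]


# Hand-written stand-in for a rendered docs page (not real build output).
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "TechArticle", "headline": "Getting Started"}
</script>
</head><body></body></html>"""

schemas = extract_json_ld(SAMPLE_PAGE)
print([schema["@type"] for schema in schemas])
```

Running it over the files of a built site would surface any page whose rendered block fails to parse, which a syntax check on the template alone cannot do.
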
Original prompt

Implement JSON-LD Schema.org markup for BreadcrumbList and TechArticle across the FalkorDB docs site.

Requirements

  1. Add two JSON-LD <script> blocks to the main HTML head so they are included on all documentation pages:

    • A BreadcrumbList that reflects the current page hierarchy.
    • A TechArticle describing the current page.
  2. BreadcrumbList details

    • Use Jekyll + Just-the-Docs structures.
    • Build the breadcrumb trail dynamically from the current page:
      • Use page.url and site.pages (or any existing Just-the-Docs breadcrumb/ancestor variables, if available) to compute ancestors.
      • Ensure the first breadcrumb item is "Home" pointing to the site root.
      • The last breadcrumb item should be the current page.
    • Implementation guidance (base solution):
      • Split page.url into segments, accumulate paths, and for each accumulated path, look up the corresponding page in site.pages by url.
      • Create an ordered array of breadcrumb pages, starting after root.
      • Generate JSON-LD with @type "BreadcrumbList" and itemListElement as an array of ListItem objects.
      • ListItem fields:
        • position: 1-based index in the breadcrumb trail.
        • name: page title.
        • item: absolute URL built from site.url + page.url (or site.url + '/').
  3. TechArticle details

    • Define the current page as a TechArticle.
    • Use the following Liquid variables:
      • headline: {{ page.title }}
      • url: {{ site.url }}{{ page.url }}
      • datePublished: {{ page.date | date_to_xmlschema }}
      • dateModified: {{ page.last_modified_at | date_to_xmlschema }} if present, otherwise fall back to page.date.
    • author and publisher should both be the organization "FalkorDB" with:
  4. Location and structure

    • Place both JSON-LD blocks in the global head include used by the docs site, before the closing </head> tag. For a standard Just-the-Docs setup, this is usually something like _includes/head_custom.html or an equivalent head partial that is already referenced.
    • Add clear Liquid comments documenting:
      • Which Liquid variables are used for breadcrumbs.
      • How the breadcrumb pages are discovered (via site.pages and page.url segmentation or any Just-the-Docs ancestor structure).
      • Which variables are used in the TechArticle (headline, url, dates, author/publisher).
  5. Implementation example (guidance, not strict):

    • For breadcrumbs:
      • Compute url_path = page.url | split: '/' | reject: ''.
      • Iterate through segments to build cumulative_path (e.g., /guide/, /guide/intro/).
      • For each cumulative_path, find a matching page in site.pages where p.url == cumulative_path; push matches into breadcrumb_pages.
      • Render JSON-LD with position 1 = Home, positions 2..N taken from breadcrumb_pages.
    • For TechArticle:
      • Assign published = page.date | date_to_xmlschema.
      • Assign modified = page.last_modified_at | date_to_xmlschema or fall back to published.
      • Render the TechArticle JSON-LD using page.title, site.url + page.url, published, modified.
  6. Constraints and notes

    • Use only Liquid / Jekyll constructs compatible with GitHub Pages (no custom plugins).
    • Ensure JSON-LD is valid JSON (use Liquid filters like jsonify where needed, e.g., for page.title).
    • Do not break existing head content; integrate additions cleanly.
    • Prefer to update an existing head include rather than duplicating head logic.
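
The breadcrumb guidance in items 2 and 5 can be sketched outside Liquid to make the intended algorithm concrete. This is an illustration only, assuming every page URL ends in a trailing slash; the function `build_breadcrumbs` and the `pages` mapping (a stand-in for site.pages) are hypothetical.

```python
import json


def build_breadcrumbs(page_url, pages, site_url):
    """Return a BreadcrumbList dict for page_url.

    pages maps a URL such as "/guide/" to its title, standing in for site.pages.
    """
    items = [
        {"@type": "ListItem", "position": 1, "name": "Home", "item": site_url + "/"}
    ]
    cumulative_path = "/"
    # Split the URL into non-empty segments and accumulate one path per level.
    for segment in (s for s in page_url.split("/") if s):
        cumulative_path += segment + "/"
        title = pages.get(cumulative_path)
        if title is None:
            continue  # no page exists at this level, so it gets no breadcrumb
        items.append(
            {
                "@type": "ListItem",
                "position": len(items) + 1,  # positions stay 1-based and gap-free
                "name": title,
                "item": site_url + cumulative_path,
            }
        )
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }


# Hypothetical page index for illustration.
pages = {"/guide/": "Guide", "/guide/intro/": "Introduction"}
trail = build_breadcrumbs("/guide/intro/", pages, "https://docs.falkordb.com")
print(json.dumps(trail, indent=2))
```

Deriving `position` from the number of items already emitted keeps the numbering contiguous even when an intermediate path has no page.
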

Deliverables

  • A pull request that:
    • Adds or updates the appropriate head include file in FalkorDB/docs so that both JSON-LD blocks are included on all docs pages.
    • Includes inline comments explaining the usage of Liquid variables for breadcrumbs and TechArticle schema.
    • Passes Jekyll build (no Liquid errors, valid JSON-LD).
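
The jsonify note in the constraints matters because a title interpolated between hand-written quotes breaks the block as soon as it contains a double quote. A small sketch of the difference, using Python's json.dumps as a stand-in for the jsonify filter; the sample title is hypothetical.

```python
import json


def parses(text):
    """Return True if text is valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical page title containing double quotes.
title = 'Using the "MATCH" clause'

# Naive interpolation, like "headline": "{{ page.title }}".
naive = '{"headline": "' + title + '"}'

# Escaped interpolation, like "headline": {{ page.title | jsonify }}.
safe = '{"headline": ' + json.dumps(title) + "}"

print(parses(naive))  # False
print(parses(safe))   # True
```

Note that the escaped form supplies its own surrounding quotes, so the template must not add a second pair around it.
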

This pull request was created as a result of the prompt above, issued from Copilot chat.


Summary by CodeRabbit

  • Chores
    • Added structured data markup to improve search engine optimization, enabling better indexing and enhanced presentation of page content in search results, including breadcrumb navigation and article metadata.

coderabbitai bot commented Nov 16, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds JSON-LD structured data schemas to improve SEO. Introduces breadcrumb and TechArticle schemas with conditional rendering based on page metadata, including site/page titles, URLs, and modification dates. Schemas are included in the document head.

Changes

Cohort / File(s) | Summary
JSON-LD Schema Includes: _includes/breadcrumb_schema.html, _includes/techarticle_schema.html | New schema files: breadcrumb schema constructs Home → optional parent → current page hierarchy; TechArticle schema assembles metadata (headline, description, URL, dates, author/publisher) from page and site Liquid variables with conditional rendering for optional fields.
Head Template Integration: _includes/head_custom.html | Modified to include the two new schema files after the Google Tag Manager block.

Sequence Diagram

sequenceDiagram
    participant Browser
    participant head_custom.html
    participant breadcrumb_schema.html
    participant techarticle_schema.html
    
    Browser->>head_custom.html: Render page head
    alt Not home page
        head_custom.html->>breadcrumb_schema.html: Include breadcrumb schema
        breadcrumb_schema.html-->>head_custom.html: Render BreadcrumbList JSON-LD
    end
    head_custom.html->>techarticle_schema.html: Include TechArticle schema
    techarticle_schema.html-->>head_custom.html: Render TechArticle JSON-LD<br/>(headline, description, dates, author)
    head_custom.html-->>Browser: Inject schemas in <head>

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Both schema files are template-driven with straightforward Liquid conditionals
  • No complex logic or state management
  • Minor change to head_custom.html (simple include additions)
  • Consider verifying Schema.org compliance for both BreadcrumbList and TechArticle formats

Poem

🐰 A breadcrumb trail through pages we roam,
From home to the parent, and back to the loam,
With schemas so golden, the search engines see,
FalkorDB's articles, structured and free! 🌟

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately and specifically summarizes the main changes: adding JSON-LD structured data for two schema types (BreadcrumbList and TechArticle) across three modified files.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI changed the title from "[WIP] Add JSON-LD Schema.org markup for BreadcrumbList and TechArticle" to "Add JSON-LD structured data for BreadcrumbList and TechArticle" Nov 16, 2025
Copilot AI requested a review from gkorland November 16, 2025 19:24
Copilot finished work on behalf of gkorland November 16, 2025 19:24
@gkorland gkorland marked this pull request as ready for review November 17, 2025 09:39
@gkorland

@CodeRabbit review

coderabbitai bot commented Nov 17, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@gkorland gkorland requested a review from Copilot November 17, 2025 09:39
Copilot AI left a comment

Pull Request Overview

This PR adds Schema.org JSON-LD structured data markup to improve SEO and enable rich snippets in search results for the FalkorDB documentation site.

  • Implements BreadcrumbList schema to show page hierarchy in search results
  • Implements TechArticle schema to mark pages as technical documentation
  • Integrates both schemas into the site header via head_custom.html

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File | Description
_includes/breadcrumb_schema.html | Generates BreadcrumbList JSON-LD from page hierarchy using Jekyll frontmatter
_includes/techarticle_schema.html | Creates TechArticle JSON-LD with page metadata and organization details
_includes/head_custom.html | Includes both schema templates in the site header

"name": "FalkorDB",
"logo": {
"@type": "ImageObject",
"url": "{{ site.url | default: 'https://docs.falkordb.com' }}/images/falkor-logo.png"

Copilot AI Nov 17, 2025

The logo URL path does not match the path specified in the requirements. According to the PR description, the logo should be at '/assets/logo.png', but this implementation uses '/images/falkor-logo.png'. This will result in a broken logo URL unless the file exists at this alternate location.

Suggested change
"url": "{{ site.url | default: 'https://docs.falkordb.com' }}/images/falkor-logo.png"
"url": "{{ site.url | default: 'https://docs.falkordb.com' }}/assets/logo.png"

Contributor

@copilot the logo is in images/falkor-logo.png

Contributor Author

Confirmed - the logo path /images/falkor-logo.png is correct and matches the site configuration in _config.yml (line 4). The file exists and the schema is using the correct path.

Comment on lines +14 to +18
{% if page.last_modified_at %}
"dateModified": "{{ page.last_modified_at | date_to_xmlschema }}",
{% elsif page.date %}
"dateModified": "{{ page.date | date_to_xmlschema }}",
{% endif %}

Copilot AI Nov 17, 2025

The TechArticle schema is missing the required 'datePublished' field as specified in the requirements. The schema includes 'dateModified' but should include both 'datePublished' (from page.date) and 'dateModified' (from page.last_modified_at with fallback to page.date) as separate fields.

Contributor

@copilot apply changes based on this feedback

Contributor Author

Added the datePublished field to the TechArticle schema. The schema now includes both datePublished (from page.date) and dateModified (from page.last_modified_at with fallback to page.date) as separate fields as required. Changes in commit e2343f6.
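
The date handling described in this thread can be sketched as follows. This is an illustration rather than the template itself: the function `article_dates` is hypothetical, Python's isoformat() stands in for the date_to_xmlschema filter, and omitting a field when no date exists is an assumption.

```python
from datetime import datetime, timezone


def article_dates(date=None, last_modified_at=None):
    """Build the date fields of a TechArticle from page front matter values.

    datePublished comes from date; dateModified prefers last_modified_at
    and falls back to date. A field is left out when no value is available.
    """
    fields = {}
    if date is not None:
        fields["datePublished"] = date.isoformat()
    modified = last_modified_at if last_modified_at is not None else date
    if modified is not None:
        fields["dateModified"] = modified.isoformat()
    return fields


published = datetime(2025, 11, 16, tzinfo=timezone.utc)
edited = datetime(2025, 11, 17, 9, 39, tzinfo=timezone.utc)

print(article_dates(published))          # both fields carry the publication date
print(article_dates(published, edited))  # dateModified uses the later timestamp
print(article_dates())                   # {} when the page has no dates at all
```
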

coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f2c2bb and 83e9030.

📒 Files selected for processing (3)
  • _includes/breadcrumb_schema.html (1 hunks)
  • _includes/head_custom.html (1 hunks)
  • _includes/techarticle_schema.html (1 hunks)
🔇 Additional comments (4)
_includes/head_custom.html (1)

14-17: Include statements added correctly.

The two new schema includes are placed appropriately after the GTM block. Syntax and placement look good.

_includes/breadcrumb_schema.html (1)

18-22: Verify parent page discovery handles edge cases.

Line 20 searches for pages by title: site.pages | where: "title", page.parent. If multiple pages share the same title, only the first match is used (line 22). Consider whether this is the intended behavior, or if a more specific key (e.g., page URL or slug) would be more reliable.
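
The ambiguity is easy to see with two pages that share a title. A minimal sketch with hypothetical page data; `find_parent_by_title` mirrors the `where: "title"` lookup followed by `first`, and `find_parent_by_url` is one possible alternative that assumes URLs end in a trailing slash.

```python
# Hypothetical pages; two of them share the title "Commands".
pages = [
    {"title": "Commands", "url": "/cypher/commands/"},
    {"title": "Commands", "url": "/operations/commands/"},
    {"title": "Cypher", "url": "/cypher/"},
]


def find_parent_by_title(pages, parent_title):
    """Return the first page with the given title, or None."""
    matches = [p for p in pages if p["title"] == parent_title]
    return matches[0] if matches else None


def find_parent_by_url(pages, page_url):
    """Return the page one path level above page_url, or None."""
    parent_url = page_url.rstrip("/").rsplit("/", 1)[0] + "/"
    for p in pages:
        if p["url"] == parent_url:
            return p
    return None


child_url = "/operations/commands/list/"
print(find_parent_by_title(pages, "Commands")["url"])  # /cypher/commands/
print(find_parent_by_url(pages, child_url)["url"])     # /operations/commands/
```

For the child under /operations/, the title lookup returns the page from the other section, while the URL lookup finds the actual ancestor.
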

_includes/techarticle_schema.html (2)

5-36: Schema structure and sanitization look solid.

The TechArticle schema is well-formed with appropriate fallbacks (lines 9, 13) and defensive sanitization of user-provided content (line 11: strip_html | strip_newlines | escape). The dateModified fallback chain (lines 14–18) handles pages without explicit timestamps gracefully.


28-28: Verify that the logo image path exists at build/deployment time.

Line 28 references /images/falkor-logo.png relative to site.url. Ensure this asset is present in the deployed site to avoid broken schema references.

Comment on lines +5 to +40
{% if page.url != "/" %}
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "Home",
"item": "{{ site.url | default: 'https://docs.falkordb.com' }}"
}
{% assign position = 2 %}
{% if page.parent %}
{% comment %}Find parent page{% endcomment %}
{% assign parent_pages = site.pages | where: "title", page.parent %}
{% if parent_pages.size > 0 %}
{% assign parent = parent_pages | first %}
,{
"@type": "ListItem",
"position": {{ position }},
"name": "{{ parent.title }}",
"item": "{{ site.url | default: 'https://docs.falkordb.com' }}{{ parent.url }}"
}
{% assign position = position | plus: 1 %}
{% endif %}
{% endif %}
,{
"@type": "ListItem",
"position": {{ position }},
"name": "{{ page.title }}",
"item": "{{ site.url | default: 'https://docs.falkordb.com' }}{{ page.url }}"
}
]
}
</script>

⚠️ Potential issue | 🔴 Critical

Critical: Invalid JSON syntax due to conditional leading commas.

The JSON-LD BreadcrumbList has a fatal flaw in how commas are handled around conditionally-rendered items.

When the parent block (lines 18–31) does not render, the resulting JSON is:

"itemListElement": [
  { "position": 1, ... }
  ,{ "position": 2, ... }  // ← Invalid: leading comma after closing brace
]

This violates JSON syntax and will be rejected by search engines. Similarly, when the parent block does render, you get a double-comma sequence (},\n ,{), which is also invalid.

The root cause: The leading commas at lines 23 and 32 assume a preceding element, but JSON requires commas between elements, not before them.

Suggested fix: Move the comma after the Home item and handle the parent conditionally:

   {
     "@type": "ListItem",
     "position": 1,
     "name": "Home",
     "item": "{{ site.url | default: 'https://docs.falkordb.com' }}"
-  }
+  },
   {% assign position = 2 %}
   {% if page.parent %}
   {% comment %}Find parent page{% endcomment %}
   {% assign parent_pages = site.pages | where: "title", page.parent %}
   {% if parent_pages.size > 0 %}
   {% assign parent = parent_pages | first %}
-  ,{
+  {
     "@type": "ListItem",
     "position": {{ position }},
     "name": "{{ parent.title }}",
     "item": "{{ site.url | default: 'https://docs.falkordb.com' }}{{ parent.url }}"
-  }
+  },
    {% assign position = position | plus: 1 %}
   {% endif %}
   {% endif %}
-  ,{
+  {
     "@type": "ListItem",
     "position": {{ position }},
     "name": "{{ page.title }}",
     "item": "{{ site.url | default: 'https://docs.falkordb.com' }}{{ page.url }}"
   }
 ]

This ensures the JSON array maintains valid syntax regardless of whether the parent element is present.
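
Which comma layouts a JSON parser accepts can be checked directly before choosing a fix. A small sketch using Python's json module; the strings are hand-written stand-ins for rendered output, not the actual build output.

```python
import json

HOME = '{"@type": "ListItem", "position": 1, "name": "Home"}'
PAGE = '{"@type": "ListItem", "position": 2, "name": "Docs"}'

# Comma at the start of the line holding the second element.
leading_comma = "[\n" + HOME + "\n," + PAGE + "\n]"
# Comma at the end of the line holding the first element.
trailing_comma = "[\n" + HOME + ",\n" + PAGE + "\n]"
# Two commas between the same pair of elements.
double_comma = "[\n" + HOME + ",\n," + PAGE + "\n]"


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(leading_comma))   # True
print(is_valid_json(trailing_comma))  # True
print(is_valid_json(double_comma))    # False
```

In this sketch a comma parses wherever the surrounding whitespace puts it, as long as exactly one comma separates two elements. The layout that fails is one where a conditional branch leaves two commas in a row or a comma with no element on one side, so the rendered output of pages with and without a parent is the thing to test.
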

@gkorland

https://gh.io/copilot-coding-agent-docs apply changes based on #282 (comment)

Copilot finished work on behalf of gkorland November 17, 2025 09:51

Copilot AI commented Nov 17, 2025

https://gh.io/copilot-coding-agent-docs apply changes based on #282 (comment)

The changes from discussion r2533384371 have already been applied in commit e2343f6. The TechArticle schema now includes both datePublished (from page.date) and dateModified (from page.last_modified_at with fallback to page.date) as separate fields as requested.

Copilot finished work on behalf of gkorland November 17, 2025 09:56