Update frontmatter with keywords #14475

Open
sean1588 wants to merge 3 commits into master from sean/add-keywords-frontmatter

sean1588 (Member) commented Mar 21, 2025

Generate keywords and add them to frontmatter.

This PR reads in the markdown files and uses their content to generate keywords. It then writes the frontmatter back to each file, with the keyword list under search.keywords.
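For readers following along, the flow described above could be sketched roughly as follows. This is not the script from the first commit: the frequency-based keyword picker, the `STOPWORDS` set, and all function names here are hypothetical stand-ins, and the frontmatter is edited as plain text (rather than re-serialized) only to keep the sketch free of third-party dependencies.

```python
import re
from collections import Counter

# Hypothetical stopword list; the real script's filtering is not shown in this PR.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "for", "is", "with", "this", "that", "it", "on"}


def split_frontmatter(text):
    """Return (frontmatter, body) for a file that starts with a '---' block."""
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, re.DOTALL)
    if match is None:
        return "", text
    return match.group(1), match.group(2)


def generate_keywords(body, limit=5):
    """Pick the most frequent non-stopword tokens (a stand-in heuristic)."""
    words = re.findall(r"[a-z][a-z0-9_]+", body.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


def add_keywords(text, limit=5):
    """Append a search.keywords list to the frontmatter, leaving other lines untouched."""
    frontmatter, body = split_frontmatter(text)
    keywords = generate_keywords(body, limit)
    has_search = re.search(r"^search:", frontmatter, re.MULTILINE)
    if not frontmatter or not keywords or has_search:
        return text  # no frontmatter, nothing to add, or a search block already exists
    lines = ["search:", "  keywords:"] + [f"  - {k}" for k in keywords]
    return "---\n" + frontmatter + "\n" + "\n".join(lines) + "\n---\n" + body
```

Appending lines to the existing frontmatter text, instead of parsing and re-dumping it, is one way to avoid unrelated formatting changes in the diff, at the cost of not handling files that already have a `search:` block.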

1st commit - contains the Python script used to generate the keywords

2nd commit - config changes just to enable generation in the testing environment and push to the testing Algolia index

3rd commit - some pages for reference, to sample some of the results this produces. The results have a bit of a diff due to inconsistent YAML formatting, such as indentation and spacing. I tried to preserve the existing formatting as much as possible, but because of the inconsistency, it looks like there may be no way around it.

sean1588 force-pushed the sean/add-keywords-frontmatter branch from d8f9e17 to 2efc465 on March 21, 2025 01:24
sean1588 requested a review from a team as a code owner on March 21, 2025 01:26
sean1588 requested a review from mjeffryes on March 21, 2025 01:30
sean1588 force-pushed the sean/add-keywords-frontmatter branch from 7ff9b33 to 5f08699 on March 21, 2025 20:43

mjeffryes (Member) left a comment

Interesting! It doesn't feel like we're quite there yet, as the keywords it's picking out seem too general in a lot of cases, but let's keep playing with it!

(It might be interesting to run it on the blog posts too; those pages have a lot of content that probably isn't making it into the index)

- /docs/esc-cli/
search:
  keywords:
  - esc_env_version

Do the underscores work in the index? I would have expected "esc env version".
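If the spaced form turns out to be what the index needs, the conversion itself is small. A minimal sketch, assuming keywords arrive as plain strings (`normalize_keyword` is a hypothetical helper, not something in this PR):

```python
def normalize_keyword(keyword):
    """Replace underscores with spaces and collapse repeated whitespace."""
    return " ".join(keyword.replace("_", " ").split())
```

With this, `normalize_keyword("esc_env_version")` gives `"esc env version"`. Whether the index treats the two forms differently is still the open question here.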

Comment on lines +5 to +9
- esc_env
- environments
- env
- environment
- esc

Hmm, this list ends up pretty redundant (I think Algolia can handle plurals?)

And it's kind of missing the obvious "list environments" keyword.
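One way to trim that kind of redundancy before writing the list, sketched under the assumption that a plural is only dropped when its singular is already present (`drop_redundant_plurals` is a hypothetical helper, and the trailing-"s" check is deliberately naive):

```python
def drop_redundant_plurals(keywords):
    """Drop exact duplicates and any 's'-plural whose singular is also in the list."""
    present = set(keywords)
    result = []
    for keyword in keywords:
        if keyword.endswith("s") and keyword[:-1] in present:
            continue  # the singular form already covers this entry
        if keyword not in result:
            result.append(keyword)
    return result
```

On the list above this removes `environments` and keeps `esc_env`, `env`, `environment`, and `esc`. It does not address the overlap between `env` and `environment`, and it cannot add a missing phrase like "list environments".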

- /docs/esc/sdk/
search:
  keywords:
  - pulumi

We should probably blacklist "pulumi" from being used as a keyword.
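Something along these lines would do it. This is a sketch only; `EXCLUDED_KEYWORDS` and `filter_excluded` are hypothetical names, and the set would presumably grow beyond this one entry:

```python
# Terms that appear on nearly every page and so carry no signal as keywords.
EXCLUDED_KEYWORDS = {"pulumi"}


def filter_excluded(keywords, excluded=EXCLUDED_KEYWORDS):
    """Return the keywords with excluded terms removed, compared case-insensitively."""
    return [k for k in keywords if k.lower() not in excluded]
```

For example, `filter_excluded(["pulumi", "esc", "node"])` leaves `["esc", "node"]`.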

Comment on lines +18 to +21
- esc
- node
- javascript
- typescript

Surprised we don't get "SDK" or "language" for this page.

The javascript/typescript keywords also seem pretty weak since they are so general.
