Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a blog post for windows memory pressure eviction #48535

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

marosset
Copy link
Contributor

@marosset marosset commented Oct 24, 2024

Description

Support for memory-pressure based eviction for Windows nodes was added in K8s v1.31.
Documentation on k8s.io website was added with #47306 but would also like to publish a blog post to cover some of the more Windows specific behaviors

Issue

/sig windows

@k8s-ci-robot k8s-ci-robot added the sig/windows Categorizes an issue or PR as relevant to SIG Windows. label Oct 24, 2024
@k8s-ci-robot k8s-ci-robot added the area/blog Issues or PRs related to the Kubernetes Blog subproject label Oct 24, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 24, 2024
@marosset
Copy link
Contributor Author

/assign @jsturtevant for technical review

Copy link

netlify bot commented Oct 24, 2024

Pull request preview available for checking

Built without sensitive environment variables

Name Link
🔨 Latest commit 8e12b21
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-io-main-staging/deploys/673d1d270520da000872f6ba
😎 Deploy Preview https://deploy-preview-48535--kubernetes-io-main-staging.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. To make life easier for blog reviewers, can you look at https://kubernetes.io/docs/contribute/style/style-guide/ and adopt any guidance that is obviously also relevant for blog articles?

For example, headings should be sentence case, and new concepts should be italicized.


In Kubernetes 1.31, support for memory-pressure based eviction was added for Windows nodes.

This post describes what these changes mean for you and how they may differ from usage on Linux nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I parse this as "This post describes what these changes mean for you and how these changes may differ from usage on Linux nodes.". If "they" means something else like "Windows node usage", please be explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-worded the summary - please take another look @bridgetkromhout - thanks!


## Additional Resources

To gain a deeper understanding of how memory is managed on Windows, you may also find this article on [Physical and Virtual Memory in Windows 10](https://answers.microsoft.com/en-us/windows/forum/all/physical-and-virtual-memory-in-windows-10/e36fb5bc-9ac8-49af-951c-e7d39b979938) helpful.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that Windows 11 is available, does this Windows 10 article from 2015 apply to the newer version also? Or is it scoped only to 10, and maybe 11 has some other info that we need to point to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still relevant and is the most informative article I could find on the topic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is one detailed blog post 😮


Additionally, exposing the Commit usage for individial Pods and Containers through the `/stats/summary` API is being worked on as part of the [cAdvisor-les, CRI-full Container and Pod Stats](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2371-cri-pod-container-stats/README.md) enhancement.

[windows-exporter](https://github.com/prometheus-community/windows_exporter) can also be configured to monitor and report on Commit memory usage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe mention it works for containers and node level today if that information is required now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - PTAL

Copy link
Contributor

@jsturtevant jsturtevant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall content looks good from sig-windows perspective

Copy link
Member

@ArvindParekh ArvindParekh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few formatting and grammatical errors :)

@marosset
Copy link
Contributor Author

Hi. To make life easier for blog reviewers, can you look at https://kubernetes.io/docs/contribute/style/style-guide/ and adopt any guidance that is obviously also relevant for blog articles?

For example, headings should be sentence case, and new concepts should be italicized.

Thanks @sftim - I made some updates so this blog post will better adhere to the style guides.

@jsturtevant
Copy link
Contributor

/lgtm
for sig-windows, pending the small format updates

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 25, 2024
Comment on lines 52 to 54
Currently, `kubectl top` continues to report Working Set memory. As a result, a Pod may be evicted for exceeding its memory requests or encounter memory allocation failures due to exceeding its `CommitLimit`, even when the _WorkingSet_ memory usage appears to be well within the expected bounds. This discrepancy can lead to confusion.

Starting with v1.31, Windows nodes now report the global `CommitTotal` and `CommitLimit` for each node through the `/stats/summary` API endpoint. This information is available under the `windows-global-commit-memory` syscontainer API resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks out of date.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part specifically?
I don't think this is out of date.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 28, 2024
@tengqm
Copy link
Contributor

tengqm commented Oct 30, 2024

/lgtm
/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Oct 30, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 30, 2024
@marosset
Copy link
Contributor Author

marosset commented Nov 1, 2024

/hold
I'm not sure what the process for assigning a release date is for blogs but this one still has TBD

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 1, 2024
Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @marosset

  • Please set the publication date to Tuesday 11th December 2024 (we'll update that after it merges). This is a change to the article path in source and to the front matter.
  • Please add draft: true to the page front matter

(nit)
Ideally, also avoid writing component names such as kubelet in anything other than all-lowercase.

@sftim
Copy link
Contributor

sftim commented Nov 11, 2024

/lgtm cancel

OK to LGTM this if it has a future publication date agreed with release comms, or it it's marked as draft and the publication date is set to the planned release date for K8s 1.32

Otherwise, reviewers should hold off there.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 11, 2024
@k8s-ci-robot k8s-ci-robot requested a review from tengqm November 11, 2024 17:50
@marosset
Copy link
Contributor Author

Thanks @marosset

  • Please set the publication date to Tuesday 11th December 2024 (we'll update that after it merges). This is a change to the article path in source and to the front matter.
  • Please add draft: true to the page front matter

(nit) Ideally, also avoid writing component names such as kubelet in anything other than all-lowercase.

ack - i'll make some updates early next week (after kubecon) and address all these comments

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from sftim. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@marosset
Copy link
Contributor Author

@sftim - I made the following updates

  • renamed 1 instance of Kubelet to kubelet,
  • updated the arcticle name in source
  • set draft: true in the front matter
  • set publication date to 2024-12-11 in the front matter.
    Let me know if there is anything else needed!

@marosset
Copy link
Contributor Author

marosset commented Dec 5, 2024

@sftim / @tengqm - Is there any other action required from me for this?
Thanks!

@sftim
Copy link
Contributor

sftim commented Dec 5, 2024

I don't see this on https://github.com/orgs/kubernetes/projects/195/views/4

The advice in #48535 (comment) was conditional on this being accepted for the post-release comms series. @marosset can you liaise there (#release-comms on Slack)? It may be that this becomes an informal article that goes out after the formal post release comms.

@marosset
Copy link
Contributor Author

marosset commented Dec 5, 2024

I don't see this on https://github.com/orgs/kubernetes/projects/195/views/4

The advice in #48535 (comment) was conditional on this being accepted for the post-release comms series. @marosset can you liaise there (#release-comms on Slack)? It may be that this becomes an informal article that goes out after the formal post release comms.

I think it would make more sense for this to go out as an informal article.
This is going more in-depth into work that was done in v1.31 and prior, not 1.32 / the current release.
Is there anything that needs to happen if we go that route?

Comment on lines +2 to +8
layout: blog
title: "Windows Memory Pressure Eviction"
slug: windows-memory-pressure-eviction
date: 2024-12-11
draft: true
author: >
Mark Rossetti (Microsoft)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, try this:

Suggested change
layout: blog
title: "Windows Memory Pressure Eviction"
slug: windows-memory-pressure-eviction
date: 2024-12-11
draft: true
author: >
Mark Rossetti (Microsoft)
layout: blog
title: "Windows Memory Pressure Eviction"
slug: windows-memory-pressure-eviction
date: 2024-12-09
author: >
Mark Rossetti (Microsoft)

and change the path in the front matter to match the new date.


Starting with v1.31, Windows nodes report the global `CommitTotal` and `CommitLimit` for each node through the `/stats/summary` API endpoint. This information is available under the `windows-global-commit-memory` named object in the `systemContainers` collection:

```json=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use console here because this is not a JSON document


Understanding the distinction between _Working Set_ and _Commit_ memory and being able to monitor both is critical for maintaining the stability of individual Pod workloads and the entire Windows node.

Currently, `kubectl top` continues to report Working Set memory. As a result, a Pod may be evicted for exceeding its memory requests or encounter memory allocation failures due to exceeding its `CommitLimit`, even when the _Working Set_ memory usage appears to be well within the expected bounds. This discrepancy can lead to confusion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean:

Suggested change
Currently, `kubectl top` continues to report Working Set memory. As a result, a Pod may be evicted for exceeding its memory requests or encounter memory allocation failures due to exceeding its `CommitLimit`, even when the _Working Set_ memory usage appears to be well within the expected bounds. This discrepancy can lead to confusion.
Before v1.31, the values you saw from `kubectl top` reported Working Set memory. As a result, a Pod could have been evicted for exceeding its memory requests, or a Pod might have encountered memory allocation failures (due to exceeding its `CommitLimit`) - and this could happen even when the _Working Set_ memory usage appeared to be well within the expected bounds. This discrepancy could have led to confusion.

It's not clear if v1.31 improved things.

@sftim
Copy link
Contributor

sftim commented Dec 6, 2024

To be clear: this won't be a post-release blog article. I'd got mixed up and thought this was part of formal post-release comms.

@sftim
Copy link
Contributor

sftim commented Dec 10, 2024

/hold

It'll need a new release date and we'll need to update this to avoid people inferring that the feature was part of the v1.32 release.

Sorry about the wait on reviews here.

OK to unhold once a publication date is agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/blog Issues or PRs related to the Kubernetes Blog subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. language/en Issues or PRs related to English language sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
Status: Requires update
Status: No status
Development

Successfully merging this pull request may close these issues.

7 participants