CGROUP aware resource monitor on memory #38718

@wujiaqi

Title: Add a cgroup-aware resource monitor for memory

Description:
I'm opening this issue to have a preliminary discussion on how to implement this. Someone on my team can do the implementation once we get agreement.

We run an Istio Ingress Gateway with the overload manager configured to shed load at memory utilization thresholds. The goal is to prevent OOMKills of our pods, especially during high-load events. However, the existing fixed_heap resource monitor only reports the memory that tcmalloc believes is allocated. OOMKills are based on what the OS sees, not on what tcmalloc thinks, so we need a monitor that tracks the OS view. The pressure reported by fixed_heap is often substantially lower than the usage reported by cgroups.

Below is an experiment I conducted to demonstrate the discrepancy.

During Load
Docker stats

CONTAINER ID   NAME      CPU %     MEM USAGE / LIMIT   MEM %
2696a94996b9   envoy     50.56%    489.5MiB / 512MiB   95.61%

Envoy metric

overload.envoy.resource_monitors.fixed_heap.pressure: 87

After Load
Docker stats

CONTAINER ID   NAME      CPU %     MEM USAGE / LIMIT   MEM %
2696a94996b9   envoy     0.48%     343.1MiB / 512MiB   67.01%

Envoy metric

overload.envoy.resource_monitors.fixed_heap.pressure: 16

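As you can see, heap pressure is lower than the OS-reported memory consumption in both cases: 87 versus 95.61% during load, and 16 versus 67.01% after load. The gap is widest after load, when the container still holds most of its memory.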

I propose adding a new memory resource monitor based on cgroups rather than tcmalloc stats. Some systems currently run cgroups v1, others run cgroups v2, and some are in hybrid mode, so it would be worth hiding this detail behind a single "cgroups enabled" setting in the configuration. During object construction, the monitor can detect whether the system is on cgroups v1 or v2, for example by checking the filesystem for the presence of the hierarchies.

If the following files are present, the system is on cgroups v2:

  • /sys/fs/cgroup/memory.max
  • /sys/fs/cgroup/memory.current

Otherwise, if the following directory exists, the system is on cgroups v1:

  • /sys/fs/cgroup/memory

We will pick the highest available cgroups implementation on the system during construction.
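
To make the detection order concrete, here is a minimal sketch of the check described above, written in Python for brevity rather than as the eventual Envoy C++ code. The names `CgroupVersion`, `detect_cgroup_version`, and `read_memory_pressure` are hypothetical. The v1 file names `memory.usage_in_bytes` and `memory.limit_in_bytes` are the standard v1 memory controller files and are an assumption about what the monitor would read, since the proposal above only names the v1 directory. The root path is a parameter so the logic can be exercised against a fake directory tree.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class CgroupVersion(Enum):
    V1 = 1
    V2 = 2


def detect_cgroup_version(root: Path = Path("/sys/fs/cgroup")) -> Optional[CgroupVersion]:
    """Return the highest cgroup memory interface found under root, or None."""
    # cgroups v2: the unified hierarchy exposes memory.max and memory.current.
    if (root / "memory.max").is_file() and (root / "memory.current").is_file():
        return CgroupVersion.V2
    # cgroups v1: the memory controller is mounted as its own directory.
    if (root / "memory").is_dir():
        return CgroupVersion.V1
    return None


def read_memory_pressure(root: Path = Path("/sys/fs/cgroup")) -> Optional[float]:
    """Return usage / limit in [0.0, 1.0], or None when no usable limit is found."""
    version = detect_cgroup_version(root)
    if version is CgroupVersion.V2:
        usage_file = root / "memory.current"
        limit_file = root / "memory.max"
    elif version is CgroupVersion.V1:
        # Assumed v1 file names (standard v1 memory controller files).
        usage_file = root / "memory" / "memory.usage_in_bytes"
        limit_file = root / "memory" / "memory.limit_in_bytes"
    else:
        return None

    limit_text = limit_file.read_text().strip()
    # cgroups v2 writes the literal string "max" when no limit is set.
    if limit_text == "max":
        return None
    limit = int(limit_text)
    if limit <= 0:
        return None
    usage = int(usage_file.read_text().strip())
    # Clamp so a transient overshoot never reports more than full pressure.
    return min(usage / limit, 1.0)
```

In hybrid mode, where both layouts are visible, this sketch returns v2 because that branch is checked first, which matches the "highest available" rule. How an unlimited cgroup should map to pressure (here `None`) is an open design question for the discussion.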

Appreciate the feedback, thanks.

related issue #36681

cc @ramaraochavali
