Resource-aware VM scheduling (t-shirt sizes or host-capability autodetection) #24

Description

@Zorlin

Motivated by growing the london k3s cluster into a heterogeneous fleet (see riffcc/infra). Today each VM's memory/cores and the Proxmox node it lands on are hand-specified per host in inventory, and placing N workers across nodes of wildly different sizes (e.g. 4-core/31G boxes alongside 32-core/121G boxes) is a manual packing problem.

Proposal

Let the proxmox_vm provisioner size and place VMs declaratively, via two complementary modes:

  1. T-shirt sizes (cloud/AWS-style): a named size maps to memory/cores, e.g. size: large → 32G/16c. Sites define the size catalog (in inventory or role defaults).
  2. Host-capability autodetection: given a per-VM spec (e.g. 4 vCPU / 12G) and a target packing factor (e.g. 4 per host), query each candidate Proxmox node's actual maxcpu/maxmem/free capacity and pack accordingly. A 16c/32t/96G host then auto-hosts 4× (4 vCPU/12G) workers, while a 4c/31G host hosts 1, with no per-VM hand-tuning.
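
To make the two modes concrete, here is a rough sketch of the arithmetic, not a proposed implementation. Every name in it (VmSpec, NodeCapacity, resolve_size, vms_that_fit, place) is hypothetical and not taken from proxmox_vm.rs or ProvisionConfig. It assumes capacity has already been fetched from Proxmox and reduced to whole GiB, and that a node's CPU count means physical cores. Whether to count cores or threads, and how much headroom to leave, are open policy questions.

```rust
use std::collections::HashMap;

/// Resources for one VM. Hypothetical type, not the real ProvisionConfig.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VmSpec {
    memory_gib: u64,
    cores: u32,
}

/// Capacity of one candidate Proxmox node. Hypothetical type; the values are
/// assumed to be derived from the node's reported maxcpu/maxmem/free capacity.
#[derive(Debug, Clone)]
struct NodeCapacity {
    name: String,
    cpus: u32,
    free_mem_gib: u64,
}

/// Mode 1: look up a named t-shirt size in a site-defined catalog.
fn resolve_size(catalog: &HashMap<String, VmSpec>, size: &str) -> Option<VmSpec> {
    catalog.get(size).copied()
}

/// Mode 2: how many VMs of `spec` fit on `node`, limited by CPU, by memory,
/// and by the packing factor `max_per_host`.
fn vms_that_fit(node: &NodeCapacity, spec: VmSpec, max_per_host: u32) -> u32 {
    if spec.cores == 0 || spec.memory_gib == 0 {
        return 0; // a zero-sized spec is treated as invalid rather than unlimited
    }
    let by_cpu = node.cpus / spec.cores;
    // Clamp before narrowing so a huge memory value cannot wrap around.
    let by_mem = (node.free_mem_gib / spec.memory_gib).min(u64::from(u32::MAX)) as u32;
    by_cpu.min(by_mem).min(max_per_host)
}

/// Greedily assign up to `wanted` VMs to nodes in the order given.
/// Returns (node name, VM count) pairs, skipping nodes that receive none.
fn place(
    nodes: &[NodeCapacity],
    spec: VmSpec,
    max_per_host: u32,
    wanted: u32,
) -> Vec<(String, u32)> {
    let mut remaining = wanted;
    let mut plan = Vec::new();
    for node in nodes {
        if remaining == 0 {
            break;
        }
        let count = vms_that_fit(node, spec, max_per_host).min(remaining);
        if count > 0 {
            plan.push((node.name.clone(), count));
            remaining -= count;
        }
    }
    plan
}
```

With a 4-core/12G spec and a packing factor of 4, a node with 16 cores and 96G free is CPU-bound at 4 workers, and a node with 4 cores and 31G free is CPU-bound at 1, matching the example above. The greedy in-order fill is just the simplest placement policy; a real scheduler would probably sort or score nodes first.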

Why

Heterogeneous and hyperconverged deployments: one declarative spec, and the engine packs across whatever nodes exist. This removes the manual placement step (and the risk of overcommitting a small node or underusing a big one).

Out of scope here

This issue is a feature request only; nothing is being implemented yet. The current infra work hand-places workers (12G/4c default, 32G/16c on the big nodes) as a stopgap.

Refs: proxmox_vm provisioner (src/provisioners/proxmox_vm.rs), ProvisionConfig memory/cores/node fields.

Labels: enhancement