Conversation

@jackschefer-msft
Contributor

This change fixes #2257, where PCIe's hacky MMIO CRS allocation logic can conflict with RAM in the memory map or non-PCIe MMIO gaps used for VMBUS devices.

This change includes:

  • Support for "device reserved" regions in the memory config and memory layout
  • Moving PCIe CRS reservation from the VM worker to the control process, alongside MMIO gap selection
  • Adding the PNP0A03 _CID to the SSDT which is needed for Linux to honor the _CRS allocation (otherwise it just finds free address space like the VMBUS gaps)
  • Bumping the default low CRS size per PCIe root complex to 64MB, so the default is sufficient for Linux to enumerate a couple of NVMe controllers
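
As a rough, hypothetical sketch (simplified stand-in types, not the actual vm_topology code), a "device reserved" region is just an address span the layout promises not to hand out for RAM or other MMIO gaps:

```rust
// Simplified sketch; `Range` and `is_free` are illustrative stand-ins,
// not the real MemoryRange/MemoryLayout types from vm_topology.
#[derive(Clone, Copy, Debug)]
pub struct Range {
    pub start: u64, // inclusive
    pub end: u64,   // exclusive
}

impl Range {
    pub fn overlaps(&self, other: &Range) -> bool {
        self.start < other.end && other.start < self.end
    }
}

/// A candidate placement is usable only if it avoids every occupied span;
/// RAM, MMIO gaps, and device-reserved gaps are treated identically here.
pub fn is_free(candidate: Range, occupied: &[Range]) -> bool {
    occupied.iter().all(|r| !r.overlaps(&candidate))
}
```

The point of the new bookkeeping is that the PCIe CRS window participates in this same overlap check instead of being carved out ad hoc.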

@jackschefer-msft jackschefer-msft requested a review from a team October 29, 2025 19:35
@jackschefer-msft jackschefer-msft requested review from a team as code owners October 29, 2025 19:35
Copilot AI review requested due to automatic review settings October 29, 2025 19:35
@github-actions github-actions bot added the unsafe Related to unsafe code label Oct 29, 2025
@github-actions

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon

Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds support for device-reserved memory regions in the memory layout system. It introduces a new device_reserved memory type to differentiate PCIe MMIO allocations from general MMIO gaps, enabling better memory management for PCI Express root complexes.

Key Changes

  • Introduces device_reserved field and AddressType::DeviceReserved to track PCIe-allocated memory regions separately from general MMIO gaps
  • Updates MemoryLayout::new() signature to accept device-reserved ranges and validates them alongside MMIO gaps
  • Modifies PCIe root complex configuration to pre-allocate MMIO ranges from the control process and pass them as MemoryRange instead of sizes
  • Changes default PCIe low MMIO window size from 4MB to 64MB
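
The third change (computing concrete ranges in the control process and passing them down as MemoryRange) can be pictured as a first-fit search over already-claimed spans. The function name and strategy below are assumptions for illustration, not OpenVMM's actual allocator:

```rust
// Hypothetical first-fit sketch of pre-allocating an MMIO window in the
// control process; not the real OpenVMM allocation code.
fn align_up(x: u64, align: u64) -> u64 {
    (x + align - 1) & !(align - 1)
}

/// Find a `size`-byte window inside [search_start, search_end) that does
/// not overlap any claimed (start, end) span. `claimed` must be sorted
/// by start address.
fn find_window(
    claimed: &[(u64, u64)],
    search_start: u64,
    search_end: u64,
    size: u64,
    align: u64,
) -> Option<(u64, u64)> {
    let mut cursor = align_up(search_start, align);
    for &(s, e) in claimed {
        if cursor + size <= s {
            return Some((cursor, cursor + size));
        }
        cursor = align_up(cursor.max(e), align);
    }
    (cursor + size <= search_end).then(|| (cursor, cursor + size))
}
```

The resulting concrete range, rather than a size, is what would then be handed to the VM worker.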

Reviewed Changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.

Summary per file:

  • vm/vmcore/vm_topology/src/memory.rs: Core implementation; adds device_reserved field, DeviceReserved address type, updates memory layout construction to handle device-reserved gaps
  • openvmm/hvlite_defs/src/config.rs: Updates PcieRootComplexConfig to store MMIO ranges instead of sizes, adds device_reserved_gaps to MemoryConfig
  • openvmm/openvmm_entry/src/lib.rs: Refactors PCIe MMIO allocation logic to compute ranges upfront and store them in device_reserved_gaps
  • openvmm/hvlite_core/src/worker/dispatch.rs: Updates memory layout creation with device-reserved gaps; validates PCIe MMIO ranges are properly reserved
  • openvmm/hvlite_core/src/worker/vm_loaders/igvm.rs: Passes device-reserved gaps to VTL2 memory range calculation
  • openhcl/virt_mshv_vtl/src/processor/hardware_cvm/mod.rs: Treats DeviceReserved address type like MMIO for access emulation
  • vm/acpi/src/ssdt.rs: Adds _CID compatibility identifier to PCIe device ACPI definition
  • openvmm/openvmm_entry/src/cli_args.rs: Updates documentation and defaults for PCIe low MMIO size (4MB → 64MB)
  • vm/devices/net/net_backend/src/tests.rs: Updates test call with empty device-reserved array
  • vmm_core/src/acpi_builder.rs: Updates test call with empty device-reserved array
  • tmk/tmk_vmm/src/run.rs: Updates MemoryLayout::new() call with empty device-reserved array
  • openhcl/underhill_mem/src/registrar.rs: Updates test call with empty device-reserved array
  • openvmm/openvmm_entry/src/ttrpc/mod.rs: Adds empty device_reserved_gaps to memory configuration
  • openvmm/openvmm_entry/Cargo.toml: Adds memory_range dependency
  • Cargo.lock: Updates dependency graph to include memory_range in openvmm_entry
Comments suppressed due to low confidence (1)

vm/vmcore/vm_topology/src/memory.rs:4

  • Corrected grammatical error: 'Tools to the compute' should be 'Tools to compute the' or 'Tools for computing the'.
//! Tools to the compute guest memory layout.

Ram,
/// The address describes mmio.
Mmio,
/// The address describes a device reserved region.
Member


should we add some info on what these generally are?

Member


or i'm wondering, should we strongly type this for what these explicitly are for (pci?)

Contributor Author


My take is no: we should decouple the address space layout from how that layout is used. When the next device needing reserved address space comes along (virtio-mmio and vSMMU come to mind), it would be a shame to have to add a new layout type for each. This primitive is just "don't put anything else here"

Member


hmm, but there are cases where, to select ranges, we need to know all the ranges in the system, right? I'm not sure I agree: if we have to manage what's used and what's free in the address space across multiple disjoint places, aren't we back to the same problem we had before this PR?

Contributor Author


I don't think I follow. The MemoryLayout should know all the ranges that are used or available (it must, so that it can place RAM without GPA overlaps). All I'm saying is that it doesn't need to know the particular purpose of every single one of those ranges; it can treat many types of ranges (the virtio-mmio bus, SMMU control registers, really any ACPI device that has a _CRS) identically for layout purposes
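
A minimal sketch of that argument: RAM placement only needs the union of occupied spans, whatever their purpose. Names and structure here are hypothetical, not OpenVMM's MemoryLayout:

```rust
// Illustrative only: the placement routine consumes every occupied range
// (MMIO gaps, device-reserved gaps, etc.) but never their purpose.
/// Place `ram_size` bytes of RAM into the holes left by `gaps` (sorted,
/// non-overlapping (start, end) spans), lowest address first.
fn place_ram(ram_size: u64, gaps: &[(u64, u64)], addr_limit: u64) -> Vec<(u64, u64)> {
    let mut ram = Vec::new();
    let mut remaining = ram_size;
    let mut cursor = 0u64;
    // A sentinel "gap" at addr_limit closes out the final hole.
    for &(s, e) in gaps.iter().chain(std::iter::once(&(addr_limit, addr_limit))) {
        let hole = s.saturating_sub(cursor);
        let take = hole.min(remaining);
        if take > 0 {
            ram.push((cursor, cursor + take));
            remaining -= take;
        }
        cursor = e;
    }
    ram
}
```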

Member


While for layout purposes I agree that all we need to know is "is this address range used?", that doesn't help debugging/inspection output. I'd rather we be explicit when we know what a range is for: why erase that information?

}

validate_ranges(gaps)?;
validate_ranges(mmio_gaps)?;
Contributor


Is there anything to gain from validating them separately if we're just going to validate them together later?

Contributor Author


I believe this is where control-process-provided MemoryConfig validation occurs, and we specifically test that such misconfiguration fails, which is why I have not changed these precondition checks in this PR
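
For illustration, a validate_ranges-style precondition might require non-empty, sorted, non-overlapping spans. This is an assumed simplification, not the real check in memory.rs:

```rust
// Hypothetical sketch of a `validate_ranges`-style precondition; the
// actual OpenVMM check may differ in details and error types.
fn validate_ranges(ranges: &[(u64, u64)]) -> Result<(), String> {
    let mut last_end = 0u64;
    for &(start, end) in ranges {
        if start >= end {
            return Err(format!("empty or inverted range {start:#x}..{end:#x}"));
        }
        if start < last_end {
            return Err(format!("range {start:#x}..{end:#x} overlaps or is unsorted"));
        }
        last_end = end;
    }
    Ok(())
}
```

Running this per range list (as the original code does) pinpoints which input was malformed before any combined validation runs.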



Development

Successfully merging this pull request may close these issues.

openvmm - PCIe attached NVMe emulator and virtio vpci conflict

3 participants