Skip to content

Conversation

jlevon
Copy link
Contributor

@jlevon jlevon commented Jun 25, 2025

No description provided.

bonzini and others added 30 commits June 5, 2025 20:24
Rust makes the current file available as a statically-allocated string,
but without a NUL terminator.  Allow this by storing an optional maximum
length in the Error.

Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
The function name is not available in Rust, so make it optional.

Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Provide an implementation of std::error::Error that bridges the Rust
anyhow::Error and std::panic::Location types with QEMU's Error*.

It also has several utility methods, analogous to error_propagate(),
that convert a Result into a return value + Error** pair.  One important
difference is that these propagation methods *panic* if *errp is NULL,
unlike error_propagate() which eats subsequent errors[1].  The reason
for this is that in C you have an error_set*() call at the site where
the error is created, and calls to error_propagate() are relatively rare.

In Rust instead, even though these functions do "propagate" a
qemu_api::Error into a C Error**, there is no error_setg() anywhere that
could check for non-NULL errp and call abort().  error_propagate()'s
behavior of ignoring subsequent errors is generally considered weird,
and there would be a bigger risk of triggering it from Rust code.

[1] This is actually a violation of the preconditions of error_propagate(),
    so it should not happen.  But you never know...

Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Remove the need to convert after every read of the BqlCell.  Because the
vmstate uses a u8 as the size of the VARRAY, this requires switching
the VARRAY to use num_timers_save; which in turn requires ensuring that
the num_timers_save is always there.  For simplicity do this by
removing support for version 1, which QEMU has not been producing for
~15 years.

Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
No functional change intended.

Suggested-by: Zhao Liu <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Do not silently adjust num_timers, and fail if intcap is 0.

Reviewed-by: Markus Armbruster <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Match the code in hpet.c; this also allows removing the
BqlCell from the num_timers field.

Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Now that the num_timers field is initialized as a property, someone may
change its default value using qdev_prop_set_uint8(), but the value is
fixed after the Rust code sees it first.  Since there is no need to modify
it after realize(), it is not to be necessary to have a BqlCell wrapper.

Signed-off-by: Zhao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[Remove .into() as well. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
error is new; offset_of is gone.

Reviewed-by: Zhao Liu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
If the enum includes values such as "Ok", "Err", or "Error", the TryInto
macro can cause errors.  Be careful and qualify identifiers with the full
path, or in the case of TryFrom<>::Error do not use the associated type
at all.

Signed-off-by: Paolo Bonzini <[email protected]>
A page state change is typically followed by an access of the page(s) and
results in another VMEXIT in order to map the page into the nested page
table. Depending on the size of page state change request, this can
generate a number of additional VMEXITs. For example, under SNP, when
Linux is utilizing lazy memory acceptance, memory is typically accepted in
4M chunks. A page state change request is submitted to mark the pages as
private, followed by validation of the memory. Since the guest_memfd
currently only supports 4K pages, each page validation will result in
VMEXIT to map the page, resulting in 1024 additional exits.

When performing a page state change, invoke KVM_PRE_FAULT_MEMORY for the
size of the page state change in order to pre-map the pages and avoid the
additional VMEXITs. This helps speed up boot times.

Signed-off-by: Tom Lendacky <[email protected]>
Link: https://lore.kernel.org/r/f5411c42340bd2f5c14972551edb4e959995e42b.1743193824.git.thomas.lendacky@amd.com
Signed-off-by: Paolo Bonzini <[email protected]>
futex(2) - Linux manual page
https://man7.org/linux/man-pages/man2/futex.2.html
> Note that a wake-up can also be caused by common futex usage patterns
> in unrelated code that happened to have previously used the futex
> word's memory location (e.g., typical futex-based implementations of
> Pthreads mutexes can cause this under some conditions).  Therefore,
> callers should always conservatively assume that a return value of 0
> can mean a spurious wake-up, and use the futex word's value (i.e.,
> the user-space synchronization scheme) to decide whether to continue
> to block or not.

Signed-off-by: Akihiko Odaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Windows supports futex-like APIs since Windows 8 and Windows Server
2012.

Signed-off-by: Akihiko Odaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
scripts/checkpatch.pl warns for __linux__ saying "architecture specific
defines should be avoided".

Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
qemu-thread used to abstract pthread primitives into futex for the
QemuEvent implementation of POSIX systems other than Linux. However,
this abstraction has one key difference: unlike futex, pthread
primitives require an explicit destruction, and it must be ordered after
wait and wake operations.

It would be easier to perform destruction if a wait operation ensures
the corresponding wake operation finishes as POSIX semaphore does, but
that requires to protect state accesses in qemu_event_set() and
qemu_event_wait() with a mutex. On the other hand, real futex does not
need such a protection but needs complex barrier and atomic operations
to ensure ordering between the two functions.

Add special implementations of qemu_event_set() and qemu_event_wait()
using pthread primitives. qemu_event_wait() will ensure qemu_event_set()
finishes, and these functions will avoid complex barrier and atomic
operations to ensure ordering between them.

Signed-off-by: Akihiko Odaki <[email protected]>
Tested-by: Phil Dennis-Jordan <[email protected]>
Reviewed-by: Phil Dennis-Jordan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Use the futex-based implementation of QemuEvent on Windows to
remove code duplication and remove the overhead of event object
construction and destruction.

Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
This unlocks the futex-based implementation of QemuLockCnt to Windows.

Signed-off-by: Akihiko Odaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Document QemuEvent to help choose an appropriate synchronization
primitive.

Signed-off-by: Akihiko Odaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
pause_event can utilize qemu_event_reset() to discard events.

Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
colo_exit_sem and colo_incoming_sem represent one-shot events so they
can be converted into QemuEvent, which is more lightweight.

Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
thread_sync_sem is an one-shot event so it can be converted into
QemuEvent, which is more lightweight.

Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
sem in AppleGFXReadMemoryJob is an one-shot event so it can be converted
into QemuEvent, which is more specialized for such a use case.

Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
The Intel SDM section 10.2.3.3 on the MXCSR.FTZ bit says that we
flush outputs to zero when we detect underflow, which is after
rounding.  Set the detect_ftz flag accordingly.

This allows us to enable the test in fma.c which checks this
behaviour.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
The softfloat get_float_exception_flags() function returns 'int', but
in various places in target/i386 we incorrectly store the returned
value into a uint8_t.  This currently has no ill effects because i386
doesn't care about any of the float_flag enum values above 0x40.
However, we want to start using float_flag_input_denormal_used, which
is 0x4000.

Switch to using 'int' so that we can handle all the possible valid
float_flag_* values. This includes changing the return type of
save_exception_flags() and the argument to merge_exception_flags().

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
The x86 DE bit in the FPU and MXCSR status is supposed to be set
when an input denormal is consumed. We didn't previously report
this from softfloat, so the x86 code either simply didn't set
the DE bit or else incorrectly wired it up to denormal_flushed,
depending on which register you looked at.

Now we have input_denormal_used we can wire up these DE bits
with the semantics they are supposed to have.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
Add some fma test cases that check for correct handling of FTZ and
for the flag that indicates that the input denormal was consumed.

Signed-off-by: Peter Maydell <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
… staging

Python Pull Request

Add QAPI and QAPI doc files to python static analysis testing regime,
this time for real, probably

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+ber27ys35W+dsvQfe+BBqr8OQ4FAmhB388ACgkQfe+BBqr8
# OQ6lMA//WJtSr57ADW5k5zcRMxV7k//erYFkjgXbTh7b9DDblMwNVhYr5lqJbEvS
# V5OChW32++QIO5Y4cBhzbzxFTJXbAYzyg3UATCkH2kRbd139bqdAtsnsaFmoHmLP
# c8KAggT1+hIb7JIVkFiFccMsdCeFwXwQoS5Nk7w95H9cxxYUj/O9qbRuCN+elg/e
# mX4zaq6F2umTx0EdD35DlBPrPPyRsdlVWKUqh8f5KaAGPOelGyvbgwrXU2MT7ewG
# JXcRoYzn/9J2KSboiFY0MjIKqDuhoMdCnbSNpRNGgClJRa+VZEBPFClMe1YSXw0m
# J3kQMYeqm5S1GUG+ZrBTICY6Ch8jNq2kb3ua707JJWdYmd9gq0poF/P7gaRVbyAL
# 5UdYVVgtH/3xve2LGe0guj3v5kTK7Vo6dApwj8pRHrBWWOgAG0UgGseOJgndfCIx
# PQRsF2T4YoVdjiGB46EIgBmoFI+VJGwFRlvb6WZ0YmPedi7MuUvWmo0lbgDkaTO+
# MMqsWxShTY+xwnSFgtl1iHOAdfT6jiHcn1n+hZrGpvF492XRjW02zKiDSZECqSz5
# lg51+OaDc2HwS65sYyFb4GD7yF/PcdOj7MG/Ij9dx0GoM9/HmcVAHyRt45QNgxzc
# N7Xx6GFGs7puDoE/pSoauFtGC8XeR6Cx0HfBcXYGaJcJEq6N4yw=
# =IVAr
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 05 Jun 2025 14:19:59 EDT
# gpg:                using RSA key F9B7ABDBBCACDF95BE76CBD07DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <[email protected]>" [full]
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* tag 'python-pull-request' of https://gitlab.com/jsnow/qemu:
  qapi: delete un-needed python static analysis configs
  python: Drop redundant warn_unused_configs = True
  python: add qapi static analysis tests
  python: update missing dependencies from minreqs
  docs/qapidoc: linting fixes
  qapi: Add some pylint ignores

Signed-off-by: Stefan Hajnoczi <[email protected]>
* futex: support Windows
* qemu-thread: Avoid futex abstraction for non-Linux
* migration, hw/display/apple-gfx: replace QemuSemaphore with QemuEvent
* rust: bindings for Error
* hpet, rust/hpet: return errors from realize if properties are incorrect
* rust/hpet: Drop BqlCell wrapper for num_timers
* target/i386: Emulate ftz and denormal flag bits correctly
* i386/kvm: Prefault memory on page state change

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmhC4AgUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroP09wf+K9e0TaaZRxTsw7WU9pXsDoYPzTLd
# F5CkBZPY770X1JW75f8Xw5qKczI0t6s26eFK1NUZxYiDVWzW/lZT6hreCUQSwzoS
# b0wlAgPW+bV5dKlKI2wvnadrgDvroj4p560TS+bmRftiu2P0ugkHHtIJNIQ+byUQ
# sWdhKlUqdOXakMrC4H4wDyIgRbK4CLsRMbnBHBUENwNJYJm39bwlicybbagpUxzt
# w4mgjbMab0jbAd2hVq8n+A+1sKjrroqOtrhQLzEuMZ0VAwocwuP2Adm6gBu9kdHV
# tpa8RLopninax3pWVUHnypHX780jkZ8E7zk9ohaaK36NnWTF4W/Z41EOLw==
# =Vs6V
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 06 Jun 2025 08:33:12 EDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Paolo Bonzini <[email protected]>" [full]
# gpg:                 aka "Paolo Bonzini <[email protected]>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (31 commits)
  tests/tcg/x86_64/fma: add test for exact-denormal output
  target/i386: Wire up MXCSR.DE and FPUS.DE correctly
  target/i386: Use correct type for get_float_exception_flags() values
  target/i386: Detect flush-to-zero after rounding
  hw/display/apple-gfx: Replace QemuSemaphore with QemuEvent
  migration/postcopy: Replace QemuSemaphore with QemuEvent
  migration/colo: Replace QemuSemaphore with QemuEvent
  migration: Replace QemuSemaphore with QemuEvent
  qemu-thread: Document QemuEvent
  qemu-thread: Use futex if available for QemuLockCnt
  qemu-thread: Use futex for QemuEvent on Windows
  qemu-thread: Avoid futex abstraction for non-Linux
  qemu-thread: Replace __linux__ with CONFIG_LINUX
  futex: Support Windows
  futex: Check value after qemu_futex_wait()
  i386/kvm: Prefault memory on page state change
  rust: make TryFrom macro more resilient
  docs: update Rust module status
  rust/hpet: Drop BqlCell wrapper for num_timers
  rust/hpet: return errors from realize if properties are incorrect
  ...

Signed-off-by: Stefan Hajnoczi <[email protected]>
Qiangcy and others added 28 commits June 23, 2025 16:03
…result

Modify memory_region_set_ram_discard_manager() to return -EBUSY if a
RamDiscardManager is already set in the MemoryRegion. The caller must
handle this failure, such as having virtio-mem undo its actions and fail
the realize() process. Opportunistically move the call earlier to avoid
complex error handling.

This change is beneficial when introducing a new RamDiscardManager
instance besides virtio-mem. After
ram_block_coordinated_discard_require(true) unlocks all
RamDiscardManager instances, only one instance is allowed to be set for
one MemoryRegion at present.

Suggested-by: David Hildenbrand <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Tested-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Xiaoyao Li <[email protected]>
Signed-off-by: Chenyi Qiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
…rd()

Update ReplayRamDiscard() function to return the result and unify the
ReplayRamPopulate() and ReplayRamDiscard() to ReplayRamDiscardState() at
the same time due to their identical definitions. This unification
simplifies related structures, such as VirtIOMEMReplayData, which makes
it cleaner.

Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Reviewed-by: Xiaoyao Li <[email protected]>
Signed-off-by: Chenyi Qiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
… with guest_memfd

Commit 852f004 ("RAMBlock: make guest_memfd require uncoordinated
discard") highlighted that subsystems like VFIO may disable RAM block
discard. However, guest_memfd relies on discard operations for page
conversion between private and shared memory, potentially leading to
the stale IOMMU mapping issue when assigning hardware devices to
confidential VMs via shared memory. To address this and allow shared
device assignement, it is crucial to ensure the VFIO system refreshes
its IOMMU mappings.

RamDiscardManager is an existing interface (used by virtio-mem) to
adjust VFIO mappings in relation to VM page assignment. Effectively page
conversion is similar to hot-removing a page in one mode and adding it
back in the other. Therefore, similar actions are required for page
conversion events. Introduce the RamDiscardManager to guest_memfd to
facilitate this process.

Since guest_memfd is not an object, it cannot directly implement the
RamDiscardManager interface. Implementing it in HostMemoryBackend is
not appropriate because guest_memfd is per RAMBlock, and some RAMBlocks
have a memory backend while others do not. Notably, virtual BIOS
RAMBlocks using memory_region_init_ram_guest_memfd() do not have a
backend.

To manage RAMBlocks with guest_memfd, define a new object named
RamBlockAttributes to implement the RamDiscardManager interface. This
object can store the guest_memfd information such as the bitmap for
shared memory and the registered listeners for event notifications. A
new state_change() helper function is provided to notify listeners, such
as VFIO, allowing VFIO to do dynamically DMA map and unmap for the shared
memory according to conversion events. Note that in the current context
of RamDiscardManager for guest_memfd, the shared state is analogous to
being populated, while the private state can be considered discarded for
simplicity. In the future, it would be more complicated if considering
more states like private/shared/discarded at the same time.

In current implementation, memory state tracking is performed at the
host page size granularity, as the minimum conversion size can be one
page per request. Additionally, VFIO expected the DMA mapping for a
specific IOVA to be mapped and unmapped with the same granularity.
Confidential VMs may perform partial conversions, such as conversions on
small regions within a larger one. To prevent such invalid cases and
until support for DMA mapping cut operations is available, all
operations are performed with 4K granularity.

In addition, memory conversion failures cause QEMU to quit rather than
resuming the guest or retrying the operation at present. It would be
future work to add more error handling or rollback mechanisms once
conversion failures are allowed. For example, in-place conversion of
guest_memfd could retry the unmap operation during the conversion from
shared to private. For now, keep the complex error handling out of the
picture as it is not required.

Tested-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Signed-off-by: Chenyi Qiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[peterx: squash fixup from Chenyi to fix builds]
Signed-off-by: Peter Xu <[email protected]>
A new field, attributes, was introduced in RAMBlock to link to a
RamBlockAttributes object, which centralizes all guest_memfd related
information (such as fd and status bitmap) within a RAMBlock.

Create and initialize the RamBlockAttributes object upon ram_block_add().
Meanwhile, register the object in the target RAMBlock's MemoryRegion.
After that, guest_memfd-backed RAMBlock is associated with the
RamDiscardManager interface, and the users can execute RamDiscardManager
specific handling. For example, VFIO will register the
RamDiscardListener and get notifications when the state_change() helper
invokes.

As coordinate discarding of RAM with guest_memfd is now supported, only
block uncoordinated discard.

Tested-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Signed-off-by: Chenyi Qiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Currently we have a short paragraph saying that patches must include
a Signed-off-by line, and merely link to the kernel documentation.
The linked kernel docs have a lot of content beyond the part about
sign-off an thus are misleading/distracting to QEMU contributors.

This introduces a dedicated 'code-provenance' page in QEMU talking
about why we require sign-off, explaining the other tags we commonly
use, and what to do in some edge cases.

Signed-off-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Peter Maydell <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
Files contributed to QEMU are generally expected to be provided in the
preferred format for manipulation. IOW, we generally don't expect to
have generated / compiled code included in the tree, rather, we expect
to run the code generator / compiler as part of the build process.

There are some obvious exceptions to this seen in our existing tree, the
biggest one being the inclusion of many binary firmware ROMs. A more
niche example is the inclusion of a generated eBPF program. Or the CI
dockerfiles which are mostly auto-generated. In these cases, however,
the preferred format source code is still required to be included,
alongside the generated output.

Tools which perform user defined algorithmic transformations on code are
not considered to be "code generators". ie, we permit use of coccinelle,
spell checkers, and sed/awk/etc to manipulate code. Such use of automated
manipulation should still be declared in the commit message.

One off generators which create a boilerplate file which the author then
fills in, are acceptable if their output has clear copyright and license
status. This could be where a contributor writes a throwaway python
script to automate creation of some mundane piece of code for example.

Signed-off-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
There has been an explosion of interest in so called AI code
generators. Thus far though, this is has not been matched by a broadly
accepted legal interpretation of the licensing implications for code
generator outputs. While the vendors may claim there is no problem and
a free choice of license is possible, they have an inherent conflict
of interest in promoting this interpretation. More broadly there is,
as yet, no broad consensus on the licensing implications of code
generators trained on inputs under a wide variety of licenses

The DCO requires contributors to assert they have the right to
contribute under the designated project license. Given the lack of
consensus on the licensing of AI code generator output, it is not
considered credible to assert compliance with the DCO clause (b) or (c)
where a patch includes such generated code.

This patch thus defines a policy that the QEMU project will currently
not accept contributions where use of AI code generators is either
known, or suspected.

These are early days of AI-assisted software development. The legal
questions will be resolved eventually. The tools will mature, and we
can expect some to become safely usable in free software projects.
The policy we set now must be for today, and be open to revision. It's
best to start strict and safe, then relax.

Meanwhile requests for exceptions can also be considered on a case by
case basis.

Signed-off-by: Daniel P. Berrangé <[email protected]>
Reviewed-by: Kevin Wolf <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Signed-off-by: Markus Armbruster <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
…rx/qemu into staging

Migration / Memory pull

- Yanfei's optimization to skip log_clear during completion
- Fabiano's cleanup to remove leftover migration-helpers.c file
- Juraj's vnc fix on display pause after migration
- Jaehoon's cpr test fix on possible race of server establishment
- Chenyi's initial support on vfio enablement for guest-memfd

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCaFmzWhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbWYQD/dz08tyaL2J4EHESfBsW4Z1rEggVOM0cB
# hlXnvzf/Pb4A/0X3Hn18bOxfPAZOr8NggS5AKgzCCYVeQEWQA2Jj8hwC
# =kcTN
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 23 Jun 2025 16:04:42 EDT
# gpg:                using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Peter Xu <[email protected]>" [full]
# gpg:                 aka "Peter Xu <[email protected]>" [full]
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D  D1A9 3B5F CCCD F3AB D706

* tag 'migration-staging-pull-request' of https://gitlab.com/peterx/qemu:
  physmem: Support coordinated discarding of RAM with guest_memfd
  ram-block-attributes: Introduce RamBlockAttributes to manage RAMBlock with guest_memfd
  memory: Unify the definiton of ReplayRamPopulate() and ReplayRamDiscard()
  memory: Change memory_region_set_ram_discard_manager() to return the result
  memory: Export a helper to get intersection of a MemoryRegionSection with a given range
  migration: Don't sync volatile memory after migration completes
  tests/migration: Setup pre-listened cpr.sock to remove race-condition.
  migration: Support fd-based socket address in cpr_transfer_input
  ui/vnc: Update display update interval when VM state changes to RUNNING
  tests/qtest: Remove migration-helpers.c
  migration/ram: avoid to do log clear in the last round

Signed-off-by: Stefan Hajnoczi <[email protected]>
… staging

linux-user: fix resource leaks in gen-vdso
tcg: Add ptr+ofs alternatives to some gvec functions

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmhZ/LMdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8aCggAtZOamQ0+EMe09u9d
# slaeZDlmxHYfb4RXJQasIBi/uHoWY1bFCEWqLnjU41cpNqI7B3yihbS/YQzyI1i/
# fqjATmuhDzer7rZfdtmRdiLi6kY9SuN9tcSVMVU/kxixByPxdYspQBO8hAAQMM1X
# ZY5MIR/5nEMN/U0QUMuqd3krsxzglGQl9Dn610ddVGfzluSCKLLMS/m92gaJmz0u
# xoLTM29lfdtIA29JPpVY+1X8NJ/vTUeBvy2eXUGHjT11rHsYUzMVGCGbzCLluEzN
# V3L/aSkiwrV+wW5M7R6+hySQl65ZVRV+E9BHuln9aDnG4jdzT3conohg2cY9a5jw
# m3HqnQ==
# =U6ub
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 23 Jun 2025 21:17:39 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "[email protected]"
# gpg: Good signature from "Richard Henderson <[email protected]>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20250623' of https://gitlab.com/rth7680/qemu:
  linux-user: fix resource leaks in gen-vdso
  linux-user/aarch64: Update hwcap bits from 6.14
  tcg: Split out tcg_gen_gvec_dup_imm_var
  tcg: Split out tcg_gen_gvec_{add,sub}_var
  tcg: Split out tcg_gen_gvec_mov_var
  tcg: Split out tcg_gen_gvec_3_var
  tcg: Split out tcg_gen_gvec_2_var
  tcg: Add base arguments to check_overlap_[234]
  tcg: Add dbase argument to expand_clr
  tcg: Add dbase argument to do_dup
  tcg: Add dbase argument to do_dup_store

Signed-off-by: Stefan Hajnoczi <[email protected]>
Introduce basic plumbing for vfio-user with CONFIG_VFIO_USER.

We introduce VFIOUserContainer in hw/vfio-user/container.c, which is a
container type for the "IOMMU" type "vfio-iommu-user", and share some
common container code from hw/vfio/container.c.

Add hw/vfio-user/pci.c for instantiating VFIOUserPCIDevice objects,
sharing some common code from hw/vfio/pci.c.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Introduce the vfio-user "proxy": this is the client code responsible for
sending and receiving vfio-user messages across the control socket.

The new files hw/vfio-user/proxy.[ch] contain some basic plumbing for
managing the proxy; initialize the proxy during realization of the
VFIOUserPCIDevice instance.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Add the basic implementation for receiving vfio-user messages from the
control socket.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Add plumbing for sending vfio-user messages on the control socket.
Add initial version negotation on connection.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: John Levon <[email protected]>
Add support for getting basic device information.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Add support for getting region info for vfio-user. As vfio-user has one
fd per region, enable ->use_region_fds.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Re-use PCI setup functions from hw/vfio/pci.c to realize the vfio-user
PCI device.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
IRQ setup uses the same semantics as the traditional vfio path, but we
need to share the corresponding file descriptors with the server as
necessary.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
For vfio-user, the server holds the pending IRQ state; set up an I/O
region for the MSI-X PBA so we can ask the server for this state on a
PBA read.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
The user container will shortly need access to the underlying vfio-user
proxy; set this up.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Hook this call up to the legacy reset handler for vfio-user-pci.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
When the vfio-user container gets mapping updates, share them with the
vfio-user by sending a message; this can include the region fd, allowing
the server to directly mmap() the region as needed.

For performance, we only wait for the message responses when we're doing
with a series of updates via the listener_commit() callback.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: John Levon <[email protected]>
Unlike most other messages, this is a server->client message, for when a
server wants to do "DMA"; this is slow, so normally the server has
memory directly mapped instead.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
By default, the vfio-user subsystem will wait 5 seconds for a message
reply from the server. Add an option to allow this to be configurable.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Support an asynchronous send of a vfio-user socket message (no wait for
a reply) when the write is posted. This is only safe when no regions are
mappable by the VM. Add an option to explicitly disable this as well.

Signed-off-by: John Levon <[email protected]>
Add new message to send multiple writes to server in a single message.
Prevents the outgoing queue from overflowing when a long latency
operation is followed by a series of posted writes.

Originally-by: John Johnson <[email protected]>
Signed-off-by: Elena Ufimtseva <[email protected]>
Signed-off-by: Jagannathan Raman <[email protected]>
Signed-off-by: John Levon <[email protected]>
Add some basic documentation on vfio-user usage.

Signed-off-by: John Levon <[email protected]>
This patch introduces the vfio-user protocol specification (formerly
known as VFIO-over-socket), which is designed to allow devices to be
emulated outside QEMU, in a separate process. vfio-user reuses the
existing VFIO defines, structs and concepts.

It has been earlier discussed as an RFC in:
"RFC: use VFIO over a UNIX domain socket to implement device offloading"

Signed-off-by: Thanos Makatos <[email protected]>
Signed-off-by: John Levon <[email protected]>
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Required At least one contributor does not have an approved Oracle Contributor Agreement. label Jun 25, 2025
@jlevon jlevon closed this Jun 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Required At least one contributor does not have an approved Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.