virtio-queue: Handle address translations #75

sboeuf · 2021-08-18T16:30:43Z

For devices with access to data in memory being translated, we add to
the Queue the ability to translate the address stored in the descriptor.

It is very helpful as it performs the translation right after the
untranslated address is read from memory, avoiding any errors from
happening from the consumer's crate perspective. It also allows the
consumer to reduce greatly the amount of duplicated code for applying
the translation in many different places.

Signed-off-by: Sebastien Boeuf [email protected]

sboeuf · 2021-08-30T10:44:01Z

PTAL @jiangliu @slp @andreeaflorescu

For devices with access to data in memory being translated, we add to the Queue the ability to translate the address stored in the descriptor. It is very helpful as it performs the translation right after the untranslated address is read from memory, avoiding any errors from happening from the consumer's crate perspective. It also allows the consumer to reduce greatly the amount of duplicated code for applying the translation in many different places. Signed-off-by: Sebastien Boeuf <[email protected]>

slp

LGTM. Perhaps it'd be nice to implement a test that uses a mock-up AccessPlatform and verifies the translation of some descriptor addresses?

alexandruag · 2021-08-31T10:12:04Z

Hi, sorry was away for a while. I'll do a review and return with any comments asap.

alexandruag · 2021-09-03T07:02:08Z

Hi again, just had a better look and this is an interesting use case. From the perspective of something that always requires this, there is def an ease of use advantage with placing the functionality here, but at the same time it's otherwise completely unrelated and orthogonal to virtio queue handling and descriptor chain parsing, so it would be great to find another solution and keep concerns separated.

A better understanding of the context would also help a lot. Can you please give some more details about what generates the use case? Is the translation specific to certain devices, or memory backends, or is it required by something completely different? Depending on what the requirements are we can maybe find alternative solutions that can be plugged into devices, or a GuestMemory thin wrapper that transparently performs translations, etc.

sboeuf · 2021-09-13T08:50:08Z

From the perspective of something that always requires this, there is def an ease of use advantage with placing the functionality here, but at the same time it's otherwise completely unrelated and orthogonal to virtio queue handling and descriptor chain parsing, so it would be great to find another solution and keep concerns separated.

Well since I'm using an Option, even when it's not used, it's not adding any significant overhead to the queue handling.

A better understanding of the context would also help a lot. Can you please give some more details about what generates the use case? Is the translation specific to certain devices, or memory backends, or is it required by something completely different? Depending on what the requirements are we can maybe find alternative solutions that can be plugged into devices, or a GuestMemory thin wrapper that transparently performs translations, etc.

This is required to support virtio-iommu for virtio devices. When using virtio-iommu, you expose a vIOMMU to the guest, and the guest can see a bunch of virtio devices attached to it. This means the addresses passed through the virtqueues are IOVAs and not GPAs. That's where the need for translation comes from.

I'm open to solutions that you think would make more sense but it's very important to keep this hidden from the virtio devices themselves, otherwise we would end up having to copy/paste the same translation code for each virtio device.
A GuestMemory wrapper might be interesting, could you elaborate how we could achieve this?
This missing translation feature is the main blocker preventing Cloud-Hypervisor from moving to the upstream version of the virtio queues :)

alexandruag · 2021-09-13T09:09:25Z

Thanks for the explanation, that's some really useful context to have! I've temporarily switched focus to #76, and was just thinking whether we can leverage some of the ideas introduced there to help with meeting this use case as well. I also want to have a quick look at virtio-iommu just to consolidate the big picture a bit, and will get back to this PR soon.

jiangliu · 2021-09-14T02:58:43Z

Seems we only handle IOVA in virtio descriptors, so other guest DMA address such as descriptor table address, available ring address and used ring address have already been handled by the queue caller?

sboeuf · 2021-09-14T07:32:30Z

Seems we only handle IOVA in virtio descriptors, so other guest DMA address such as descriptor table address, available ring address and used ring address have already been handled by the queue caller?

Yes we don't need to perform the translation of the descriptor table, available ring and used ring addresses every time. Instead I chose the approach of having the VMM doing the translation only when the queue is enabled by the guest. But we can discuss this as well.

jiangliu · 2021-09-14T09:12:03Z

I'm a little concern about current design. Say guest is doing a big block device IO, mapping several discontinuous GPA into a continuous IOVA. Then only one virtio descriptor will be allocated for the block IO. How will current design handle this case?

jiangliu · 2021-09-14T09:34:09Z

I'm a little concern about current design. Say guest is doing a big block device IO, mapping several discontinuous GPA into a continuous IOVA. Then only one virtio descriptor will be allocated for the block IO. How will current design handle this case?

In other words, when virito-viommu is involved, a continuous IOVA (for a virtio descriptor) may be discontinues in GPA.

sboeuf · 2021-09-14T09:54:27Z

I'm a little concern about current design. Say guest is doing a big block device IO, mapping several discontinuous GPA into a continuous IOVA. Then only one virtio descriptor will be allocated for the block IO. How will current design handle this case?

Yes you're right that seems like a case we should also handle with care. I think having a wrapper in vm-memory to handle the translation might be the best option as suggested by @alexandruag
Do you think that would solve this issue?

jiangliu · 2021-09-14T09:56:47Z

I'm a little concern about current design. Say guest is doing a big block device IO, mapping several discontinuous GPA into a continuous IOVA. Then only one virtio descriptor will be allocated for the block IO. How will current design handle this case?

Yes you're right that seems like a case we should also handle with care. I think having a wrapper in vm-memory to handle the translation might be the best option as suggested by @alexandruag
Do you think that would solve this issue?

I have the same feeling. Maybe extending vm-memory to access guest memory by both GVA and HVA is more robust design. But then we need to really careful about performance. We have observed performance penalty caused by vm-memory.

In theory, device driver should only access guest memory with IOVA, so extending vm-memory to support IOVA is natural.

alexandruag · 2021-09-28T07:40:47Z

Hi again, I'm currently experimenting with the wrapper option, but still have to iron out a couple of things which are a bit harder to do than initially expected. Meanwhile, I was curious if there are other requirements for this use case as well. In particular, can you please give some more details about the implications of the "discontinuous in GPA, but continuous IOVA" scenario the Gerry mentions above, and what is (at least on a conceptual level) the proper way to handle it?

sboeuf · 2021-09-28T09:38:20Z

Meanwhile, I was curious if there are other requirements for this use case as well. In particular, can you please give some more details about the implications of the "discontinuous in GPA, but continuous IOVA" scenario the Gerry mentions above, and what is (at least on a conceptual level) the proper way to handle it?

Well I think it's mainly about handling the case where it might span over multiple memory regions, which is already handled internally by vm-memory.

alexandruag · 2021-10-04T13:46:52Z

Thanks! Just a quick update: the GuestMemory wrapper approach did not turn out super feasible, mostly because the GuestMemory interface has quite a lot of features, and when taking into account things like iterating over the regions, stuff quickly becomes quite complicated so I'm no longer very optimistic about this alternative right now.

I'm currently looking at two others. The first (which is not necessarily also the most appealing) relies on the observation that many queue related operations right now do not actually make use of the full GuestMemory interface, but only rely on a couple of operations, all of them from Bytes. If we restrict the trait bound to that (or something even simpler) it becomes easier to implement indirection using an adapter.

The second option is to extend the definition of generic interfaces started by Gerry's recent PR to descriptor chains as well, at which point it once again becomes simpler to have plugable functionality. Looks like the more difficult part in this case is what's the best way to dispatch operations to a particular implementation, and that's something I'm giving thought to ATM. Will be back soon with more details.

sboeuf · 2022-01-24T10:19:16Z

Thanks! Just a quick update: the GuestMemory wrapper approach did not turn out super feasible, mostly because the GuestMemory interface has quite a lot of features, and when taking into account things like iterating over the regions, stuff quickly becomes quite complicated so I'm no longer very optimistic about this alternative right now.

I'm currently looking at two others. The first (which is not necessarily also the most appealing) relies on the observation that many queue related operations right now do not actually make use of the full GuestMemory interface, but only rely on a couple of operations, all of them from Bytes. If we restrict the trait bound to that (or something even simpler) it becomes easier to implement indirection using an adapter.

The second option is to extend the definition of generic interfaces started by Gerry's recent PR to descriptor chains as well, at which point it once again becomes simpler to have plugable functionality. Looks like the more difficult part in this case is what's the best way to dispatch operations to a particular implementation, and that's something I'm giving thought to ATM. Will be back soon with more details.

@alexandruag any progress? I'd like to be able to move Cloud Hypervisor to the upstream version of vm-virtio/virtio-queue and I can't do so until we find a solution regarding this address translation.

sboeuf · 2022-01-28T10:39:44Z

Since it's very complicated to find a consensus on this matter, I've implemented the support for this thing directly in Cloud Hypervisor cloud-hypervisor/cloud-hypervisor#3641

sboeuf requested review from alexandruag, jiangliu and slp as code owners August 18, 2021 16:30

sboeuf force-pushed the add_addr_translation branch from bcb2e4b to 98f5e49 Compare August 18, 2021 16:38

sboeuf changed the title ~~virtio-queue: Add trait for handling address translation~~ virtio-queue: Handle address translations Aug 18, 2021

sboeuf requested a review from andreeaflorescu August 19, 2021 07:42

sboeuf force-pushed the add_addr_translation branch from 98f5e49 to 537dcb2 Compare August 30, 2021 10:44

slp reviewed Aug 31, 2021

View reviewed changes

sboeuf closed this Jan 28, 2022

virtio-queue: Handle address translations #75

virtio-queue: Handle address translations #75

Uh oh!

Conversation

sboeuf commented Aug 18, 2021

Uh oh!

sboeuf commented Aug 30, 2021

Uh oh!

slp left a comment

Choose a reason for hiding this comment

Uh oh!

alexandruag commented Aug 31, 2021

Uh oh!

alexandruag commented Sep 3, 2021

Uh oh!

sboeuf commented Sep 13, 2021

Uh oh!

alexandruag commented Sep 13, 2021

Uh oh!

jiangliu commented Sep 14, 2021

Uh oh!

sboeuf commented Sep 14, 2021

Uh oh!

jiangliu commented Sep 14, 2021

Uh oh!

jiangliu commented Sep 14, 2021

Uh oh!

sboeuf commented Sep 14, 2021

Uh oh!

jiangliu commented Sep 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexandruag commented Sep 28, 2021

Uh oh!

sboeuf commented Sep 28, 2021

Uh oh!

alexandruag commented Oct 4, 2021

Uh oh!

sboeuf commented Jan 24, 2022

Uh oh!

sboeuf commented Jan 28, 2022

Uh oh!

Uh oh!

jiangliu commented Sep 14, 2021 •

edited

Loading