Build AMD MxGPU and package them #200
Comments
I can't find the source RPM, so I'll have to try to build things manually from that GitHub repo and see if it matches the contents of the RPM.
The RPM is not on the Citrix ISO but on the AMD website: on the driver page you'll find a "XenServer driver download", and the RPMs are in there.
I had found it, but there is no source RPM, only the binary one.
Indeed, no SRPM. Maybe there's one for KVM.
Was able to manually build the GitHub repo on XCP-ng with the current kernel in v8 and load the gim module. I get the proper lspci output, with the card showing all the virtual interfaces on PCI. But the control interfaces don't seem to know how to use it? I can't attach the GPUs to the VM in XOA or in Center; it just tells me I can pass through the entire card but not the virtual adapters (Virtual GPU type in Center is just "Passthrough whole GPU").
@armouredking sorry for the late answer. I haven't had the opportunity to test myself yet, so I don't know what to expect.
Bookmark for myself: latest officially built MxGPU GIM (for XS >= 7.4 and < 8): https://www.amd.com/en/support/kb/release-notes/rn-pro-mxgpu-gim-1-05
@armouredking did you follow the deployment guide from the page linked above?
The mxgpu RPM that AMD distributed for XS 7.4+ contains more than what the github repository has:
Two configuration files, a tool named I don't know what In https://github.com/GPUOpen-LibrariesAndSDKs/MxGPU-Virtualization master builds fine with our kernel in XCP-ng 8.0, but this pull request may improve the compatibility. I'm not able to evaluate it and haven't the hardware to test.
Edit: actually the pull request seems not necessary anymore. The code since this commit takes kernel >= 4.14 into account. There's an open pull request for kernel >= 5 though, but we aren't there yet.
Confirmed. I can compile the GitHub drivers on XCP-ng 8.0. Modprobe seems happy, but I can only pass through the GPU on an S7150x2. I had to yum install gcc, git and kernel-devel to allow the gim.sh script to run.
Compared lspci output from 8.0 and 7.6: 7.6 shows a lot more GPU devices, versus only 2 "Tonga XT GL" GPUs on 8.0. Dunno if that helps.
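For anyone wanting to repeat the lspci comparison above between two hosts, here is a minimal Python sketch. It only counts lines that mention a given device name in saved lspci output. The function names are made up for illustration, and the match string ("Tonga XT GL") is just the one quoted in this thread; other cards will report something else.

```python
import subprocess


def count_matching_devices(lspci_text: str, needle: str) -> int:
    """Count lspci lines whose text contains `needle`, ignoring case."""
    needle = needle.lower()
    return sum(1 for line in lspci_text.splitlines() if needle in line.lower())


def compare_device_counts(old_text: str, new_text: str, needle: str) -> dict:
    """Compare how many matching PCI functions two hosts expose.

    `old_text` and `new_text` are lspci outputs saved from each host
    (for example the 7.6 and 8.0 hosts mentioned above).
    """
    old_count = count_matching_devices(old_text, needle)
    new_count = count_matching_devices(new_text, needle)
    return {"old": old_count, "new": new_count, "difference": old_count - new_count}


def read_local_lspci() -> str:
    """Run lspci on the current host and return its output as text."""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=True)
    return result.stdout
```

A positive "difference" would suggest the newer host is not exposing the virtual functions, which matches what was observed here, but it says nothing about why.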
Made some progress in 8.0. I now have the GUI showing 16 slices per GPU, and they can be assigned to a VM. Upon starting the VM, it fails with Failure "Call to gimtool failed". Gimtool is a Python script, so perhaps it wouldn't be too bad to modify if I knew what was wrong or how it is supposed to work. /var/log/xensource shows the gimtool command that was called. Entering that command into the command line yields a slightly more descriptive error:
What I did to get where I am now (paraphrased):
Install XCP-ng 8.0 from disc.
Edit /etc/gim_config
Edit (add) /etc/mxgpu-whitelist (this changes the GPU GUI thing from passthrough to shareable, for lack of a better explanation)
Add /etc/modules-load.d/gim.conf (load gim at boot)
Copy gimtool from a working 7.6 server to /opt/xensource/bin/
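The steps above can be sanity-checked with a small script. This is only a sketch: it checks that the four paths listed in the steps exist and whether a module named gim shows up in /proc/modules. It does not validate the contents of any of those files, and the module name and the helper names are assumptions for illustration (the module may be named differently depending on which build is installed).

```python
from pathlib import Path

# Paths taken from the steps listed above, relative to the filesystem root.
EXPECTED_FILES = (
    "etc/gim_config",
    "etc/mxgpu-whitelist",
    "etc/modules-load.d/gim.conf",
    "opt/xensource/bin/gimtool",
)


def missing_files(root: str = "/") -> list:
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [str(base / rel) for rel in EXPECTED_FILES if not (base / rel).exists()]


def module_loaded(name: str = "gim", proc_modules: str = "/proc/modules") -> bool:
    """Return True if a kernel module called `name` is listed in /proc/modules.

    The module name "gim" is an assumption based on this thread.
    """
    try:
        with open(proc_modules) as handle:
            return any(line.split()[0] == name for line in handle if line.strip())
    except OSError:
        # /proc/modules is unreadable or absent (for example on a non-Linux host).
        return False


if __name__ == "__main__":
    for path in missing_files():
        print("missing:", path)
    print("gim loaded:", module_loaded())
```

The `root` parameter is there so the check can be pointed at a mounted copy of another host's filesystem, or at a temporary directory when testing.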
Thanks, keep us posted! Next I'd try to understand where that "Inappropriate ioctl for device" comes from (related to the RECONFIGURE_PF ioctl if I understand correctly).
Hello! Any progress on this? Did you try to patch the kernel as in ? (Line 8).
I don't believe I ran that patch. It is all getting a bit beyond my skills. I also had to put the test box into production as a VMware box for now due to the pandemic. Hopefully after things get close to normal I can return to this.
OK thanks! Please do update if you happen to try it again!
What would be a good way to DM someone regarding this?
I'm not a developer by any means, so if there's someone who can work with me on the nuts and bolts, I have a fresh 8.1 install with a Radeon Pro W5500 that can be used for getting MxGPU working.
AMD has released a new ISO with XenServer 8.1 compatibility, available at the following link: https://www.amd.com/en/support/professional-graphics/firepro/firepro-s-series/firepro-s7150-x2 Just testing.
Any hint on how to deal with this?
gim 0000:07:00.0: not enough MMIO resources for SR-IOV
dmesg | grep gim
[ 9.176818] gim_api: loading out-of-tree module taints kernel.
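To spot this condition without reading dmesg by hand, something like the following Python sketch could be used. It filters dmesg text for lines mentioning gim and pulls out the PCI address from any line carrying the "not enough MMIO resources for SR-IOV" message quoted above. The function names are illustrative, and it only detects the message; it does not fix anything.

```python
import re

# Kernel message quoted in this thread.
MMIO_MESSAGE = "not enough MMIO resources for SR-IOV"

# PCI address in domain:bus:device.function form, e.g. 0000:07:00.0
PCI_ADDRESS = re.compile(r"\b[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]\b")


def gim_lines(dmesg_text: str) -> list:
    """Return the dmesg lines that mention gim (same idea as `dmesg | grep gim`)."""
    return [line for line in dmesg_text.splitlines() if "gim" in line.lower()]


def devices_short_of_mmio(dmesg_text: str) -> list:
    """Return PCI addresses for which the MMIO shortage message was logged."""
    addresses = []
    for line in gim_lines(dmesg_text):
        if MMIO_MESSAGE in line:
            addresses.extend(PCI_ADDRESS.findall(line))
    return addresses
```

Feeding it the output of dmesg (for example captured with subprocess or saved to a file) gives a list of affected devices, which may help when comparing BIOS settings across hosts.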
https://support.citrix.com/article/CTX250121 ?
Modifying BIOS (this is a WS460 Gen 8 blade with expansion): SR-IOV was enabled; I had to enable additional params.
lspci | grep VGA
The card is visible, shareable and configurable, but I cannot configure it with drivers; the drivers fail to load. Trying to get rid of:
Posting some intermediate results and the errors that I see.
XCP-ng 8.1 with https://github.com/GPUOpen-LibrariesAndSDKs/MxGPU-Virtualization
I can confirm that a fresh installation of XCP-ng 8.1 and AMD ISO from
@dynodix can you share Regarding having to load
Here is my lspci -k
Another hint: I first set up the fresh installation of XCP-ng 8.1, then installed AMD ISO 2.0 as a standalone machine (not in the pool), updated with yum update, and after the update I added the machine to my pool. The
Windows 10 with MxGPU drivers now behaves stably and normally. Now I'm trying to use the GPU for transcoding purposes under Ubuntu 18, but I have problems initializing the card driver and obtaining valid info with
AFAIK: all individual changes should happen after pool integration, as making them beforehand might cause problems.
Radeon Pro W5500 Result with AMD offered driver gim.ko
Result with modified open source gim module having additional support for PCI ID 1002:7341 RadeonPro is below. By default
Do you guys know if Radeon Pro is supported?
Hey @rushikeshjadhav, I believe the Radeon Pro W5500 does not support MxGPU. The current list as I understand it is below. Radeon Instinct MI6
Some of them are somewhat affordable to put in our lab. /me thinking about getting some extra hardware.
@ethanjosephscott Thanks for providing the list of supported cards.
According to https://en.wikipedia.org/wiki/Video_Coding_Engine the S7150 x2 should support VCE 3.0 with H.264 encoding/decoding... But how? ...
Yes. By specification it has to.
I found where the problem was: it's the gim driver of MxGPU. It doesn't support video processing.
After that the GPU starts working as needed.
According to a recent post in the AMD community forums: "When you say MxGPU works, I assume you mean you have enabled MxGPU. If so, then you will not get VCE on each VM. VCE on FirePro S7150 X2 works only if you use the entire GPU."
AFAIK, technically it can't work: you have fewer encoders/decoders than partitions per GPU, and they probably can't just split those parts (and I doubt there is enough demand to be worth the effort).
Sorry for reviving an 8-month-old thread, but may I ask if there is any official documentation that says that the MI100 supports MxGPU?
Hey, I made that list before the MI100 was officially announced, not sure how I knew about it? But I believe I was under the impression that all of the Instinct cards would have MxGPU support. It looks like the Radeon Pro V520 is the newest MxGPU card (not sold to the public).
I modified the open source GIM and it works with an Instinct MI6
The Radeon Pro is, as the name says, based on Radeon chips. It's a whole different chip than the Instinct/MI ones.
It's an Instinct MI6, believe me. The "lspci" command shows "Radeon Pro", but it's an Instinct MI6.
You're right, the old/first generations of Instinct were using the Radeon chips (I just looked it up).
https://github.com/GPUOpen-LibrariesAndSDKs/MxGPU-Virtualization
AMD has already built some RPMs for XS, and the package is free, but it seems it's only built for the 4.4 kernel (see mxgpu-4.4.0+10-modules-1.0.5.amd-1.x86_64.rpm)
So we should be able to package that ourselves.