ComfyUI Dynamic VRAM compatibility#572
Conversation
|
I had a quick test and it to errors with: [ERROR] Error during processing: module 'comfy' has no attribute 'ops' |
|
Is it failing at this line specifically? src/common/config.py, line 219 --> default_ops = comfy.ops.manual_cast I am on comfy stable release 0.18.1 |
|
Yes, here are more log entries: I have upgraded Comfyui |
|
Just to clarify, are you using the standalone CLI? And if that is true, are you able to test the node itself within ComfyUI? |
|
Yes, I was using the CLI. I have also tested the node within confyui, and Phase 2 DIT upscaling gives an OOM error (Error in Phase 2 (Upscaling): Allocation on device) without block swap, despite lots of RAM being available. |
|
Okay then I'll mark this as draft while I look at these issues. |
|
Just tested this MR with the latest Comfy using 7b Q4 on a 5090 batch size 41 on a 2.7MP video, no crash at all. Upscaling processed successfully. I tested with both EDIT: Btw this is through the UI not CLI |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8a2308f770
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| debug.log("Using CPU intermediate loading for MPS compatibility", | ||
| category="info", indent_level=1) | ||
| state = load_safetensors_file(checkpoint_path, device="cpu") | ||
| state = comfy.utils.load_torch_file(checkpoint_path, device=device_str) |
There was a problem hiding this comment.
Load MPS fallback tensors on CPU
In the MPS allocator-error fallback path, the retry still calls comfy.utils.load_torch_file(..., device=device_str), where device_str is still "mps". That re-runs the same failing load path instead of doing the intended CPU intermediate load, so users on MPS who hit allocation/watermark errors will continue to fail rather than recovering.
Useful? React with 👍 / 👎.
|
|
||
| # 2b. Handle InflatedCausalConv3d | ||
| elif isinstance (child, InflatedCausalConv3d) and hasattr(target_ops, "Conv3d"): | ||
| child.__bases__ = (target_ops.Conv3d,) |
There was a problem hiding this comment.
Convert InflatedCausalConv3d by replacing module instance
Setting child.__bases__ = (target_ops.Conv3d,) mutates an instance attribute, not the class inheritance used for dispatch, so this does not actually convert InflatedCausalConv3d layers to Comfy ops. As a result, models that rely on InflatedCausalConv3d keep the original layer type and miss the dynamic-VRAM operation patching this commit intends to add.
Useful? React with 👍 / 👎.
To clarify, with the base repository (without this pull request) were you able to run the model without crashes at both blocks to swap = 0 and blocks to swap = 36? |
On the official main branch yes I can run blocks to swap = 36 no problems as well. I haven't tested 0 in a long time so I'm not sure. |
|
Let me know if there's anything you want me to test |
In theory, I could use some testing with block swap turned completely off. That said, on my device which is less powerful, disabling block swap while using my pull request currently still results in OOM so I don't consider this PR up to par yet. |
|
Just realised that So I pulled the latest commit and tested it again. Still works with batch size of 41. When I use batch size 81, the VRAM goes to 100%, but I guess at that point it's all taken up by the latents not the layers, so it won't be a good test for offloading. |
Patches the model with ComfyUI's operations layers (i.e. replacing nn.Linear with operations.Linear) before loading the weights. Also uses ComfyUI's safetensors loader. In theory this should be all that is necessary to implement ComfyUI's recent Dynamic VRAM features.