
Conversation

@rattus128
Contributor

In the lowvram case, this now does its math in the model dtype post de-quantization, so account for that. The patching was also put back on the compute stream, which takes it off the peak, so relax MATH_FACTOR to only x2 and drop the worst-case assumption that everything peaks at once.
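
For reference, a minimal sketch of the estimation idea, not the actual ComfyUI code: `MATH_FACTOR` is the constant discussed above, but `dtype_size` and `lowvram_math_bytes` are hypothetical helper names. In the lowvram path a quantized weight is de-quantized to the compute dtype before the op, so the transient math memory is sized from that dtype rather than the fp8 storage dtype, and the x2 factor covers the live copies during the op now that patching no longer peaks at the same time.

```python
import torch

# Illustrative only: the helper names below are hypothetical, not ComfyUI's API.
MATH_FACTOR = 2  # relaxed from the old worst-case multiplier

def dtype_size(dtype: torch.dtype) -> int:
    """Bytes per element, e.g. 1 for fp8 storage, 2 for fp16/bf16 compute."""
    return torch.tensor([], dtype=dtype).element_size()

def lowvram_math_bytes(weight_elements: int, compute_dtype: torch.dtype) -> int:
    """Estimate transient VRAM for a de-quantized weight during compute.

    The weight is de-quantized to compute_dtype before the op, so the
    estimate uses that dtype's element size, not the quantized storage size.
    """
    return weight_elements * dtype_size(compute_dtype) * MATH_FACTOR

# Example: a 4096x4096 weight stored in fp8 but computed in fp16 needs roughly
# 4096*4096 * 2 bytes * 2 = 64 MiB of transient math memory.
print(lowvram_math_bytes(4096 * 4096, torch.float16) / (1024 ** 2), "MiB")
```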

RTX3060, flux2 fp8 with Lora:

Before: [image]

After: [image]

