[modular] add Modular flux for text-to-image #11995


Merged: 13 commits into main, Jul 29, 2025
Conversation

@sayakpaul (Member) commented Jul 26, 2025

What does this PR do?

Plan to add the other tasks in a follow-up! I hope that's okay. Code to test this PR:

import torch
from diffusers.modular_pipelines import SequentialPipelineBlocks
from diffusers.modular_pipelines.flux.modular_blocks import TEXT2IMAGE_BLOCKS
from diffusers.utils.logging import set_verbosity_debug

set_verbosity_debug()

model_id = "black-forest-labs/FLUX.1-dev"

# Compose the predefined text-to-image blocks into one sequential pipeline.
blocks = SequentialPipelineBlocks.from_blocks_dict(TEXT2IMAGE_BLOCKS)

# Initialize the pipeline, then load each model component from the FLUX.1-dev repo.
pipeline = blocks.init_pipeline()
pipeline.load_components(["text_encoder"], repo=model_id, subfolder="text_encoder", torch_dtype=torch.bfloat16)
pipeline.load_components(["tokenizer"], repo=model_id, subfolder="tokenizer")
pipeline.load_components(["text_encoder_2"], repo=model_id, subfolder="text_encoder_2", torch_dtype=torch.bfloat16)
pipeline.load_components(["tokenizer_2"], repo=model_id, subfolder="tokenizer_2")
pipeline.load_components(["scheduler"], repo=model_id, subfolder="scheduler")
pipeline.load_components(["transformer"], repo=model_id, subfolder="transformer", torch_dtype=torch.bfloat16)
pipeline.load_components(["vae"], repo=model_id, subfolder="vae", torch_dtype=torch.bfloat16)
pipeline.to("cuda")


prompt = "A cat and a dog baking a cake together in a kitchen. The cat is carefully measuring flour, while the dog is stirring the batter with a wooden spoon. The kitchen is cozy, with sunlight streaming through the window."
output = pipeline(
    prompt=prompt, num_inference_steps=28, guidance_scale=3.5, generator=torch.manual_seed(0)
)
# The pipeline returns its run state; fetch the decoded images from the intermediates.
output.get_intermediate("images")[0].save("modular_flux.png")

Output:

[image: generated sample saved as modular_flux.png]

Also, I have decided not to implement any guidance in this PR, as the original Flux pipeline doesn't have any. LMK if that is okay.

@@ -11,12 +11,14 @@
 @dataclass
 class FluxPipelineOutput(BaseOutput):
     """
-    Output class for Stable Diffusion pipelines.
+    Output class for Flux image generation pipelines.
@sayakpaul (Member Author):

Hope this change is okay.

return mu


def _pack_latents(latents, batch_size, num_channels_latents, height, width):
@sayakpaul (Member Author):

Didn't use "Copied from ..." here because:

make fix-copies enforces a weird indentation for this, which then fails the repo consistency check.

So, say you have the following as a standalone function in a module:

# Copied from diffusers.pipelines.flux.pipeline_flux.FluxPipeline._pack_latents
def _pack_latents(latents, batch_size, num_channels_latents, height, width):
    latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
    latents = latents.permute(0, 2, 4, 1, 3, 5)
    latents = latents.reshape(batch_size, (height // 2) * (width // 2), num_channels_latents * 4)

    return latents

The moment you run make fix-copies after this, you will have the following diff:

+# Copied from diffusers.pipelines.flux.pipeline_flux.FluxPipeline._pack_latents
 def _pack_latents(latents, batch_size, num_channels_latents, height, width):
-    latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
+        latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
+        latents = latents.permute(0, 2, 4, 1, 3, 5)
+        latents = latents.reshape(batch_size, (height // 2) * (width // 2), num_channels_latents * 4)
+
+        return latents
     latents = latents.permute(0, 2, 4, 1, 3, 5)
     latents = latents.reshape(batch_size, (height // 2) * (width // 2), num_channels_latents * 4)

Note the messed-up indentation. We should fix this in a separate PR. Cc: @DN6

Collaborator:

nice actually
I think we should move a lot more methods out of the pipelines and into functions.
# Copied from does not work well for people who aren't maintainers; with the modular system, all the methods are refactored to not depend on state anyway.
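To make that direction concrete, here is a toy before/after sketch (the names are hypothetical, not from this PR): a helper that silently reads pipeline state becomes a pure function with explicit inputs, which any pipeline can import without the # Copied from machinery.

# Before: the helper is a method and silently depends on self.vae_scale_factor.
class SomePipeline:
    vae_scale_factor = 8

    def latent_shape(self, height, width, num_channels=16):
        return (num_channels, height // self.vae_scale_factor, width // self.vae_scale_factor)


# After: a standalone, stateless function with explicit inputs that
# flux/ltx/sd3 blocks could all import and share directly.
def latent_shape(height, width, vae_scale_factor, num_channels=16):
    return (num_channels, height // vae_scale_factor, width // vae_scale_factor)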

@sayakpaul (Member Author):

Indeed. Could be cool to consider in the set of refactors @DN6 is doing 👀

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu (Collaborator) left a comment:

thanks @sayakpaul
can you manually create a modular repo for flux too? (see #11913 (comment))
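For reference, a repo like that can be created and populated by hand with the standard huggingface_hub API. A minimal sketch (the local folder path is a placeholder):

from huggingface_hub import create_repo, upload_folder

# Create the (empty) model repo on the Hub if it doesn't exist yet.
create_repo("diffusers-internal-dev/modular-flux.1-dev", repo_type="model", exist_ok=True)

# Upload a locally prepared folder containing the modular pipeline files.
upload_folder(
    repo_id="diffusers-internal-dev/modular-flux.1-dev",
    folder_path="./modular-flux.1-dev",  # hypothetical local path
)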

raise ValueError(f"`prompt` or `prompt_2` has to be of type `str` or `list` but is {type(prompt)}")

@staticmethod
def _get_t5_prompt_embeds(
Collaborator:

I think we can turn these two methods into functions and use them across different models: flux/ltx/sd3 ...
I will put up a prototype in one of my PRs, just FYI here
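As a rough illustration of what such a shared helper could look like (the signature here is a guess, not from this PR), the method becomes a stateless function that takes the encoder and tokenizer explicitly, so flux/ltx/sd3 blocks could all call it:

import torch

def get_t5_prompt_embeds(text_encoder, tokenizer, prompt, max_sequence_length=512, device=None):
    # Accept a single prompt or a batch of prompts.
    prompt = [prompt] if isinstance(prompt, str) else prompt
    text_inputs = tokenizer(
        prompt,
        padding="max_length",
        max_length=max_sequence_length,
        truncation=True,
        return_tensors="pt",
    )
    # No pipeline state: everything the function needs is passed in explicitly.
    with torch.no_grad():
        prompt_embeds = text_encoder(text_inputs.input_ids.to(device))[0]
    return prompt_embeds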

@sayakpaul (Member Author):

Indeed. Would be very curious to learn more.

@sayakpaul (Member Author):

@yiyixuxu here is the repo: https://huggingface.co/diffusers-internal-dev/modular-flux.1-dev/.

Do we have to manually populate the repo, as was done for Wan?

I will merge this PR once the above point is clarified.

@yiyixuxu (Collaborator):

@sayakpaul

> Do we have to manually populate the repo?

yes, manually
but we will make it work with a standard repo directly in #11944

@sayakpaul (Member Author):

Alright. I manually populated the repo. Looking forward to that PR. I will open PRs for the other tasks for Flux. This is getting very infectious to work on ❤️

@sayakpaul (Member Author):

Failing tests are unrelated.

@sayakpaul merged commit 203dc52 into main on Jul 29, 2025
13 of 15 checks passed