Conversation

@sayakpaul sayakpaul commented Oct 9, 2025

What does this PR do?

Supersedes #12269.

Test code:
import torch 
from diffusers import ModularPipeline
from diffusers.utils import load_image

repo_id = "black-forest-labs/FLUX.1-Kontext-dev"

pipe = ModularPipeline.from_pretrained(repo_id)
pipe.load_components(torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")

image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/yarn-art-pikachu.png" 
).convert("RGB")
prompt = "Make Pikachu hold a sign that says 'Black Forest Labs is awesome', yarn art style, detailed, vibrant colors"

output = pipe(
    image=image,
    prompt=prompt,
    guidance_scale=2.5,
    num_inference_steps=28,
    max_sequence_length=512,
    generator=torch.manual_seed(0)
)
output.values["images"][0].save("modular_flux_kontext_image.png")

prompt = "A cat and a dog baking a cake together in a kitchen. The cat is carefully measuring flour, while the dog is stirring the batter with a wooden spoon. The kitchen is cozy, with sunlight streaming through the window."
output = pipe(
    prompt=prompt, 
    num_inference_steps=28, 
    guidance_scale=3.5, 
    generator=torch.manual_seed(0),
    max_sequence_length=512,
)
output.values["images"][0].save("modular_flux.png")
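The test script above exercises both paths through the same call signature: passing `image` runs image-to-image (I2I), omitting it runs text-to-image (T2I). A minimal stand-in sketch of that dispatch pattern (hypothetical `run_pipeline` helper, no diffusers dependency, not the real API):

```python
# Stand-in sketch: the same call signature serves both tasks; the presence
# of `image` selects the I2I path, its absence selects T2I, mirroring the
# two calls in the test script above.
def run_pipeline(prompt, image=None, **kwargs):
    mode = "i2i" if image is not None else "t2i"
    # A real ModularPipeline would encode the prompt (and the image, if
    # given), denoise, and decode; here we only report the selected path.
    return {"mode": mode, "prompt": prompt}

print(run_pipeline("a cat", image=object())["mode"])  # i2i
print(run_pipeline("a cat")["mode"])                  # t2i
```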

Results:

T2I and I2I result images (attached in the PR).


asomoza commented Oct 9, 2025

In case you need it, here's the code I'm using to test:

import torch

from diffusers.modular_pipelines import ComponentsManager, ModularPipeline
from diffusers.utils import load_image

# CONFIG
repo_id = "black-forest-labs/FLUX.1-Kontext-dev"
device = "cuda"
prompt = "make it sknow"
guidance_scale = 2.5
num_inference_steps = 28
seed = 0
image = load_image("https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/differential/dog_source.png").convert("RGB")

# COMPONENTS MANAGER
components = ComponentsManager()
components.enable_auto_cpu_offload(device=device)

# BLOCKS
blocks = ModularPipeline.from_pretrained(repo_id, components_manager=components).blocks

# ENCODE PROMPT
text_blocks = blocks.sub_blocks.pop("text_encoder")
text_encoder_node = text_blocks.init_pipeline(repo_id, components_manager=components)
text_encoder_node.load_components(torch_dtype=torch.bfloat16)

text_state = text_encoder_node(prompt=prompt, max_sequence_length=512)
text_embeddings = text_state.get_by_kwargs("denoiser_input_fields")

# ENCODE IMAGE
encoder_block_name = next((name for name in blocks.block_names if "encode" in name.lower() and "text" not in name.lower()), None)
encoder_blocks = blocks.sub_blocks.pop(encoder_block_name)
encoder_node = encoder_blocks.init_pipeline(repo_id, components_manager=components)
encoder_node.load_components(torch_dtype=torch.bfloat16)
state = encoder_node(image=image)
image_latents = state.get("image_latents")

# DENOISE
denoise_blocks = blocks.sub_blocks.pop("denoise")
denoise_node = denoise_blocks.init_pipeline(repo_id, components_manager=components)
denoise_node.load_components(torch_dtype=torch.bfloat16)

generator = torch.Generator(device=device).manual_seed(seed)

denoise_state = denoise_node(
    **text_embeddings,
    guidance_scale=guidance_scale,
    num_inference_steps=num_inference_steps,
    generator=generator,
    image_latents=image_latents,
)
latents = denoise_state.get("latents")

# VAE DECODE
decoder_blocks = blocks.sub_blocks.pop("decode")
decoder_node = decoder_blocks.init_pipeline(repo_id, components_manager=components)
decoder_node.load_components(torch_dtype=torch.bfloat16)
image = decoder_node(latents=latents, output="images")[0]
image.save("modular_result.png")
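The node-based flow above can be sketched abstractly with stand-in functions (names mirror the script's nodes; no diffusers or GPU required, and this is not the real API): each node consumes the state produced by the previous one, which is what lets the blocks be popped out and run independently.

```python
# Stand-in nodes for the text-encode -> image-encode -> denoise -> decode
# flow; each returns a small state dict, as the real nodes return states.
def text_encoder_node(prompt):
    return {"prompt_embeds": f"embeds({prompt})"}

def image_encoder_node(image):
    return {"image_latents": f"latents({image})"}

def denoise_node(prompt_embeds, image_latents, steps):
    return {"latents": f"denoised({prompt_embeds}|{image_latents}|steps={steps})"}

def decoder_node(latents):
    return f"image<{latents}>"

# Wire the nodes together by passing state outputs forward, as in the script.
text_state = text_encoder_node("make it snow")
enc_state = image_encoder_node("dog_source.png")
den_state = denoise_node(text_state["prompt_embeds"], enc_state["image_latents"], steps=28)
result = decoder_node(den_state["latents"])
print(result)
```

The point of the sketch is the data flow, not the computation: swapping any one node (e.g. a different encoder) only requires that it produce the state keys the next node reads.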

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul sayakpaul marked this pull request as draft October 9, 2025 13:11
@sayakpaul sayakpaul marked this pull request as ready for review October 10, 2025 05:04
@sayakpaul
Member Author

@asomoza you should be good to go now :)


@asomoza asomoza left a comment


LGTM! Both T2I and I2I now work with nodes, and I don't see any other issues.

@sayakpaul sayakpaul merged commit 693d8a3 into main Oct 10, 2025
15 of 17 checks passed
@sayakpaul sayakpaul deleted the flux-kontext-modular-new branch October 10, 2025 12:40