Chroma Pipeline #11698
Conversation
Amazing work on this @Ednaordinary, huge thanks. Getting the below when adding a LoRA:

```python
scale_expansion_fn = _SET_ADAPTER_SCALE_FN_MAPPING[self.__class__.__name__]
E KeyError: 'ChromaTransformer2DModel'
```

Can we add the following?

```python
_SET_ADAPTER_SCALE_FN_MAPPING = {
    ...
    "ChromaTransformer2DModel": lambda model_cls, weights: weights,
    ...
}
```

I think that will fix it.
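For context, a minimal sketch of the call pattern that hits this lookup (the LoRA repo id below is a placeholder for illustration, not a real checkpoint; the diffusers-format Chroma repo is the one shared later in this thread):

```python
import torch
from diffusers import ChromaPipeline

# Diffusers-format Chroma checkpoint mentioned later in this thread.
pipe = ChromaPipeline.from_pretrained(
    "imnotednamode/Chroma-v36-dc-diffusers", torch_dtype=torch.bfloat16
)

# Placeholder LoRA id, for illustration only.
pipe.load_lora_weights("some-user/chroma-lora", adapter_name="style")

# set_adapters() consults _SET_ADAPTER_SCALE_FN_MAPPING to expand the
# per-adapter weights, which is where the KeyError above is raised.
pipe.set_adapters(["style"], adapter_weights=[0.8])
```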
This reverts commit 3fe4ad6.
Great work @Ednaordinary @hameerabbasi and @iddl! 🚀

Awesome. Thank you all. 👍
Thank you for this great commit! A question: is bitsandbytes quantization not supported for this model? (As is, it's 18.5 GB of VRAM, which is a bit too heavy for a lot of consumer cards.)

```python
import torch
from diffusers import ChromaTransformer2DModel, ChromaPipeline, BitsAndBytesConfig
from transformers import T5EncoderModel, T5Tokenizer

bfl_repo = "black-forest-labs/FLUX.1-dev"
dtype = torch.bfloat16

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

transformer = ChromaTransformer2DModel.from_single_file(
    "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v35.safetensors",
    torch_dtype=dtype,
    quantization_config=nf4_config,
)
text_encoder = T5EncoderModel.from_pretrained(bfl_repo, subfolder="text_encoder_2", torch_dtype=dtype)
tokenizer = T5Tokenizer.from_pretrained(bfl_repo, subfolder="tokenizer_2", torch_dtype=dtype)

pipe = ChromaPipeline.from_pretrained(
    bfl_repo,
    transformer=transformer,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    torch_dtype=dtype,
)
pipe.enable_model_cpu_offload()
```

Gives me this error:
@tin2tin bitsandbytes is supported! Just save the diffusers version first.

Actually, my weights still load fine; it just prints some unnecessary attribute warnings. Will fix when I get around to it.
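A minimal sketch of what "save the diffusers version first" plausibly involves, assuming the standard diffusers workflow (the local output path is hypothetical):

```python
import torch
from diffusers import ChromaTransformer2DModel

# Load the single-file checkpoint once...
transformer = ChromaTransformer2DModel.from_single_file(
    "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v35.safetensors",
    torch_dtype=torch.bfloat16,
)

# ...then save a diffusers-format copy, which can later be reloaded with
# from_pretrained(..., quantization_config=...).
transformer.save_pretrained("chroma-transformer-diffusers")  # hypothetical local path
```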
@Ednaordinary Oh, that's super cool! How do you load your diffusers version? Just the transformer, and quantize that? Can you share a snippet that shows how you use it? Like this?

Getting this notice:
You can safely ignore the config notice; it's because changes have been made to the diffusers code since I generated that checkpoint. Also be sure to add `llm_int8_skip_modules=["distilled_guidance_layer"]`.
@Ednaordinary Where should I add `llm_int8_skip_modules=["distilled_guidance_layer"]`?
See here; this is the code for a different model.

Also, instead of applying quantization on text_encoder_2 and transformer separately, both of these modules can be specified in quantization_config. If I remember correctly, @sayakpaul posted an example somewhere, but I somehow can't find it. It was something like:

```python
pipe = FluxPipeline(
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        llm_int8_skip_modules=["lm_head"],
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        ??=[text_encoder_2, transformer]
    )
)
```
OK, I found it: `components_to_quantize=["transformer", "text_encoder_2"]`
Another pipeline-level quant example is here: #11698 (comment). Also, yes, the parameter is passed to the BitsAndBytes config:

```python
BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    llm_int8_skip_modules=["distilled_guidance_layer"],
)
```
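For context, a sketch of how such a config would typically be passed when loading just the transformer. The repo id is the diffusers-format checkpoint shared later in this thread; the `subfolder` name assumes the standard pipeline layout:

```python
import torch
from diffusers import ChromaTransformer2DModel, BitsAndBytesConfig

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    # Keep the distilled guidance layer un-quantized, per the comment above.
    llm_int8_skip_modules=["distilled_guidance_layer"],
)

transformer = ChromaTransformer2DModel.from_pretrained(
    "imnotednamode/Chroma-v36-dc-diffusers",  # diffusers-format repo from this thread
    subfolder="transformer",                  # assumed standard pipeline layout
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)
```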
Ahh, this is what I was referring to (the sample from @sayakpaul). Thanks.
If you want ready-to-use code, this one works with the main branch:

```python
import torch
from diffusers import ChromaPipeline
from diffusers.quantizers import PipelineQuantizationConfig

dtype = torch.bfloat16
repo_id = "imnotednamode/Chroma-v36-dc-diffusers"

pipeline_quant_config = PipelineQuantizationConfig(
    quant_backend="bitsandbytes_4bit",
    quant_kwargs={
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_compute_dtype": dtype,
        "llm_int8_skip_modules": ["distilled_guidance_layer"],
    },
    components_to_quantize=["transformer", "text_encoder"],
)

pipe = ChromaPipeline.from_pretrained(
    repo_id,
    quantization_config=pipeline_quant_config,
    torch_dtype=dtype,
)
pipe.enable_model_cpu_offload()

prompt = 'Ultra-realistic, high-quality photo of an anthropomorphic capybara with a tough, streetwise attitude, wearing a worn black leather jacket, dark sunglasses, and ripped jeans. The capybara is leaning casually against a gritty urban wall covered in vibrant graffiti. Behind it, in bold, dripping yellow spray paint, the word "HuggingFace" is scrawled in large street-art style letters. The scene is set in a dimly lit alleyway with moody lighting, scattered trash, and an edgy, rebellious vibe — like a character straight out of an underground comic book.'
negative = "low quality, bad anatomy, extra digits, missing digits, extra limbs, missing limbs"

image = pipe(
    prompt=prompt,
    negative_prompt=negative,
    num_inference_steps=30,
    guidance_scale=4.0,
    width=1024,
    height=1024,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("chroma.png")
```
What does this PR do?

Fixes #11010
Relevant: #11566

Before submitting

See the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@DN6