Commit 3396143

Author: davidb
Commit message: update doc
1 parent 5f6359f commit 3396143

1 file changed: 13 additions, 13 deletions
docs/source/en/api/pipelines/photon.md

@@ -56,7 +56,7 @@ from diffusers.pipelines.photon import PhotonPipeline
 pipe = PhotonPipeline.from_pretrained("Photoroom/photon-512-t2i")
 pipe.to("cuda")

-prompt = "A vast night sky over a quiet city suddenly blazes with enormous glowing neon letters spelling “PHOTON.” The word hums and flickers dramatically, as if trying a little too hard to look epic. The soft glow bathes the rooftops and streets below in blue and pink light. A few people look up, squinting, some taking selfies; a cat blinks lazily at the sky’s new centerpiece. The air feels cinematic and electric — like a sci-fi movie that doesn’t take itself too seriously. Mist swirls around the neon glow, adding a dreamy, aesthetic touch to the humor of it all."
+prompt = "A digital painting or a heavily manipulated photograph, appearing as a surreal portrait of a young woman. The composition is a close-up, focusing on the face. The woman's face is partially obscured by fragmented, cracked, light teal and off-white pieces resembling peeling paint or decaying skin. These fragments are irregularly shaped and layered, creating a sense of depth and texture. The woman's skin is subtly illuminated, with a warm, golden light highlighting her features, particularly her lips and eyes. Her eyes are a striking light blue, contrasting with the cool tones of the fragmented elements. The overall color palette is muted, with teal, beige, and golden hues dominating. The atmosphere is melancholic and mysterious, with a hint of ethereal beauty. The style is surreal and painterly, blending realistic portraiture with abstract elements. The vibe is introspective and unsettling, suggesting themes of vulnerability, fragility, and hidden identity. The lighting is dramatic, with a chiaroscuro effect emphasizing the texture and form of the fragmented elements"
 image = pipe(prompt, num_inference_steps=28, guidance_scale=4.0).images[0]
 image.save("photon_output.png")
 ```
@@ -85,12 +85,12 @@ scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(

 # Load T5Gemma text encoder
 t5gemma_model = T5GemmaModel.from_pretrained("google/t5gemma-2b-2b-ul2")
-text_encoder = t5gemma_model.encoder
+text_encoder = t5gemma_model.encoder.to(dtype=torch.bfloat16)
 tokenizer = GemmaTokenizerFast.from_pretrained("google/t5gemma-2b-2b-ul2")
 tokenizer.model_max_length = 256
 # Load VAE - choose either Flux VAE or DC-AE
 # Flux VAE (16 latent channels):
-vae = AutoencoderKL.from_pretrained("black-forest-labs/FLUX.1-dev", subfolder="vae")
+vae = AutoencoderKL.from_pretrained("black-forest-labs/FLUX.1-dev", subfolder="vae").to(dtype=torch.bfloat16)
 # Or DC-AE (32 latent channels):
 # vae = AutoencoderDC.from_pretrained("mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers")

@@ -134,15 +134,15 @@ Key parameters for image generation:
 # Example with custom parameters
 import torch
 from diffusers.pipelines.photon import PhotonPipeline
-with torch.autocast("cuda", dtype=torch.bfloat16):
-    pipe = pipe(
-        prompt="A highly detailed 3D animated scene of a cute, intelligent duck scientist in a futuristic laboratory. The duck stands on a shiny metallic floor surrounded by glowing glass tubes filled with colorful liquids—blue, green, and purple—connected by translucent hoses emitting soft light. The duck wears a tiny white lab coat, safety goggles, and has a curious, determined expression while conducting an experiment. Sparks of energy and soft particle effects fill the air as scientific instruments hum with power. In the background, holographic screens display molecular diagrams and equations. Above the duck’s head, the word “PHOTON” glows vividly in midair as if made of pure light, illuminating the scene with a warm golden glow. The lighting is cinematic, with rich reflections and subtle depth of field, emphasizing a Pixar-like, ultra-polished 3D animation style. Rendered in ultra high resolution, realistic subsurface scattering on the duck’s feathers, and vibrant color grading that gives a sense of wonder and scientific discovery.",
-        num_inference_steps=28,
-        guidance_scale=4.0,
-        height=512,
-        width=512,
-        generator=torch.Generator("cuda").manual_seed(42)
-    ).images[0]
+pipe = PhotonPipeline.from_pretrained("Photoroom/photon-512-t2i", torch_dtype=torch.bfloat16)
+image = pipe(
+    prompt="A digital painting or a heavily manipulated photograph, appearing as a surreal portrait of a young woman. The composition is a close-up, focusing on the face. The woman's face is partially obscured by fragmented, cracked, light teal and off-white pieces resembling peeling paint or decaying skin. These fragments are irregularly shaped and layered, creating a sense of depth and texture. The woman's skin is subtly illuminated, with a warm, golden light highlighting her features, particularly her lips and eyes. Her eyes are a striking light blue, contrasting with the cool tones of the fragmented elements. The overall color palette is muted, with teal, beige, and golden hues dominating. The atmosphere is melancholic and mysterious, with a hint of ethereal beauty. The style is surreal and painterly, blending realistic portraiture with abstract elements. The vibe is introspective and unsettling, suggesting themes of vulnerability, fragility, and hidden identity. The lighting is dramatic, with a chiaroscuro effect emphasizing the texture and form of the fragmented elements",
+    num_inference_steps=28,
+    guidance_scale=4.0,
+    height=512,
+    width=512,
+    generator=torch.Generator("cuda").manual_seed(42)
+).images[0]
 ```

 ## Memory Optimization
@@ -153,7 +153,7 @@ For memory-constrained environments:
 import torch
 from diffusers.pipelines.photon import PhotonPipeline

-pipe = PhotonPipeline.from_pretrained("Photoroom/photon-512-t2i", torch_dtype=torch.float16)
+pipe = PhotonPipeline.from_pretrained("Photoroom/photon-512-t2i", torch_dtype=torch.bfloat16)
 pipe.enable_model_cpu_offload() # Offload components to CPU when not in use

 # Or use sequential CPU offload for even lower memory
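
Note on the float16 to bfloat16 switch: the commit message does not say why the dtype was changed, so the following is only one plausible motivation, not the author's stated reason. bfloat16 keeps the 8 exponent bits of float32 and therefore covers roughly the same range, while float16 has 5 exponent bits and tops out at 65504. The sketch below illustrates that difference using only the Python standard library. `to_bfloat16_truncated` and `fits_in_float16` are hypothetical helpers written for this illustration; the first emulates bfloat16 by truncating a float32 to its top 16 bits, whereas real hardware typically rounds to nearest.

```python
import struct


def to_bfloat16_truncated(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of the float32 encoding.

    Hypothetical helper for illustration; it truncates instead of rounding.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


def fits_in_float16(x: float) -> bool:
    """Return True if x can be packed as an IEEE 754 half-precision value."""
    try:
        struct.pack(">e", x)
        return True
    except OverflowError:
        return False


value = 70000.0  # just above the float16 maximum of 65504
print(fits_in_float16(value))        # float16 cannot represent this magnitude
print(to_bfloat16_truncated(value))  # bfloat16 can, at reduced precision
```

With `value = 70000.0`, the float16 check fails while the truncated bfloat16 value is 69632.0. The range comes at the cost of precision: bfloat16 stores 7 explicit mantissa bits versus 10 for float16.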
