1 | | -# Run |
| 1 | +# Usage |
2 | 2 |
3 | | -``` |
4 | | -usage: ./bin/sd-cli [options] |
5 | | -
6 | | -CLI Options: |
7 | | - -o, --output <string> path to write result image to. you can use printf-style %d format specifiers for image |
8 | | - sequences (default: ./output.png) (eg. output_%03d.png). Single-file video outputs |
9 | | - support .avi, .webm, and animated .webp |
10 | | - --image <string> path to the image to inspect (for metadata mode) |
11 | | - --metadata-format <string> metadata output format, one of [text, json] (default: text) |
12 | | - --preview-path <string> path to write preview image to (default: ./preview.png). Multi-frame previews support |
13 | | - .avi, .webm, and animated .webp |
14 | | - --preview-interval <int> interval in denoising steps between consecutive updates of the image preview file |
15 | | - (default is 1, meaning updating at every step) |
16 | | - --output-begin-idx <int> starting index for output image sequence, must be non-negative (default 0 if specified |
17 | | - %d in output path, 1 otherwise) |
18 | | - --canny apply canny preprocessor (edge detection) |
19 | | - --convert-name convert tensor name (for convert mode) |
20 | | - -v, --verbose print extra info |
21 | | - --color colors the logging tags according to level |
22 | | - --taesd-preview-only prevents usage of taesd for decoding the final image. (for use with --preview tae) |
23 | | - --preview-noisy enables previewing noisy inputs of the models rather than the denoised outputs |
24 | | - --metadata-raw include raw hex previews for unparsed metadata payloads |
25 | | - --metadata-brief truncate long metadata text values in text output |
26 | | - --metadata-all include structural/container entries such as IHDR, IDAT, and non-metadata JPEG segments |
27 | | - -M, --mode run mode, one of [img_gen, vid_gen, upscale, convert, metadata], default: img_gen |
28 | | - --preview preview method. must be one of the following [none, proj, tae, vae] (default is none) |
29 | | - -h, --help show this help message and exit |
| 3 | +For detailed command-line arguments, run: |
30 | 4 |
31 | | -Context Options: |
32 | | - -m, --model <string> path to full model |
33 | | - --clip_l <string> path to the clip-l text encoder |
34 | | - --clip_g <string> path to the clip-g text encoder |
35 | | - --clip_vision <string> path to the clip-vision encoder |
36 | | - --t5xxl <string> path to the t5xxl text encoder |
37 | | - --llm <string> path to the llm text encoder. For example: (qwenvl2.5 for qwen-image, |
38 | | - mistral-small3.2 for flux2, ...) |
39 | | - --llm_vision <string> path to the llm vit |
40 | | - --qwen2vl <string> alias of --llm. Deprecated. |
41 | | - --qwen2vl_vision <string> alias of --llm_vision. Deprecated. |
42 | | - --diffusion-model <string> path to the standalone diffusion model |
43 | | - --high-noise-diffusion-model <string> path to the standalone high noise diffusion model |
44 | | - --uncond-diffusion-model <string> path to the standalone unconditional diffusion model, currently used by |
45 | | - Ideogram4 CFG |
46 | | - --vae <string> path to standalone vae model |
47 | | - --taesd <string> path to taesd. Using Tiny AutoEncoder for fast decoding (low quality) |
48 | | - --tae <string> alias of --taesd |
49 | | - --control-net <string> path to control net model |
50 | | - --embd-dir <string> embeddings directory |
51 | | - --lora-model-dir <string> lora model directory |
52 | | - --hires-upscalers-dir <string> highres fix upscaler model directory |
53 | | - --tensor-type-rules <string> weight type per tensor pattern (example: "^vae\.=f16,model\.=q8_0") |
54 | | - --photo-maker <string> path to PHOTOMAKER model |
55 | | - --upscale-model <string> path to esrgan model. |
56 | | - -t, --threads <int> number of threads to use during computation (default: -1). If threads <= 0, |
57 | | - then threads will be set to the number of CPU physical cores |
58 | | - --chroma-t5-mask-pad <int> t5 mask pad size of chroma |
59 | | - --max-vram <float> maximum VRAM budget in GiB for graph-cut segmented execution. 0 disables |
60 | | - graph splitting; a negative value auto-detects free VRAM, sparing the |
61 | | - specified value (e.g. -0.5 will keep at least 0.5 GiB free) |
62 | | - --force-sdxl-vae-conv-scale force use of conv scale on sdxl vae |
63 | | - --offload-to-cpu place the weights in RAM to save VRAM, and automatically load them into VRAM |
64 | | - when needed |
65 | | - --mmap whether to memory-map model |
66 | | - --control-net-cpu deprecated; use --backend controlnet=cpu |
67 | | - --clip-on-cpu deprecated; use --backend te=cpu |
68 | | - --vae-on-cpu deprecated; use --backend vae=cpu |
69 | | - --fa use flash attention |
70 | | - --diffusion-fa use flash attention in the diffusion model only |
71 | | - --diffusion-conv-direct use ggml_conv2d_direct in the diffusion model |
72 | | - --vae-conv-direct use ggml_conv2d_direct in the vae model |
73 | | - --circular enable circular padding for convolutions |
74 | | - --circularx enable circular RoPE wrapping on x-axis (width) only |
75 | | - --circulary enable circular RoPE wrapping on y-axis (height) only |
76 | | - --chroma-disable-dit-mask disable dit mask for chroma |
77 | | - --qwen-image-zero-cond-t enable zero_cond_t for qwen image |
78 | | - --chroma-enable-t5-mask enable t5 mask for chroma |
79 | | - --type weight type (examples: f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_K, q3_K, |
80 | | - q4_K). If not specified, the default is the type of the weight file |
81 | | - --rng RNG, one of [std_default, cuda, cpu], default: cuda(sd-webui), cpu(comfyui) |
82 | | - --sampler-rng sampler RNG, one of [std_default, cuda, cpu]. If not specified, use --rng |
83 | | - --prediction prediction type override, one of [eps, v, edm_v, sd3_flow, flux_flow, |
84 | | - flux2_flow] |
85 | | - --lora-apply-mode the way to apply LoRA, one of [auto, immediately, at_runtime], default is |
86 | | - auto. In auto mode, if the model weights contain any quantized parameters, |
87 | | - the at_runtime mode will be used; otherwise, immediately will be used.The |
88 | | - immediately mode may have precision and compatibility issues with quantized |
89 | | - parameters, but it usually offers faster inference speed and, in some cases, |
90 | | - lower memory usage. The at_runtime mode, on the other hand, is exactly the |
91 | | - opposite. |
92 | | -
93 | | -Generation Options: |
94 | | - -p, --prompt <string> the prompt to render |
95 | | - -n, --negative-prompt <string> the negative prompt (default: "") |
96 | | - -i, --init-img <string> path to the init image |
97 | | - --end-img <string> path to the end image, required by flf2v |
98 | | - --mask <string> path to the mask image |
99 | | - --control-image <string> path to control image, control net |
100 | | - --control-video <string> path to control video frames, It must be a directory path. The video frames |
101 | | - inside should be stored as images in lexicographical (character) order. For |
102 | | - example, if the control video path is `frames`, the directory contain images |
103 | | - such as 00.png, 01.png, ... etc. |
104 | | - --pm-id-images-dir <string> path to PHOTOMAKER input id images dir |
105 | | - --pm-id-embed-path <string> path to PHOTOMAKER v2 id embed |
106 | | - --hires-upscaler <string> highres fix upscaler, Lanczos, Nearest, Latent, Latent (nearest), Latent |
107 | | - (nearest-exact), Latent (antialiased), Latent (bicubic), Latent (bicubic |
108 | | - antialiased), or a model name under --hires-upscalers-dir (default: Latent) |
109 | | - --extra-sample-args <string> extra sampler/scheduler/guidance args, key=value list. APG supports apg_eta, |
110 | | - apg_momentum, apg_norm_threshold, apg_norm_threshold_smoothing; SLG supports |
111 | | - slg_uncond; lcm supports noise_clip_std, noise_scale_start, noise_scale_end; |
112 | | - ltx2 supports max_shift, base_shift, stretch, terminal; euler_ge supports gamma |
113 | | - --extra-tiling-args <string> extra VAE tiling args, key=value list. LTX video VAE supports |
114 | | - temporal_tile_frames (default: 4), temporal_tile_overlap (default: 1) |
115 | | - -H, --height <int> image height, in pixel space (default: 512) |
116 | | - -W, --width <int> image width, in pixel space (default: 512) |
117 | | - --steps <int> number of sample steps (default: 20) |
118 | | - --high-noise-steps <int> (high noise) number of sample steps (default: -1 = auto) |
119 | | - --clip-skip <int> ignore last layers of CLIP network; 1 ignores none, 2 ignores one layer |
120 | | - (default: -1). <= 0 represents unspecified, will be 1 for SD1.x, 2 for SD2.x |
121 | | - -b, --batch-count <int> batch count |
122 | | - --video-frames <int> video frames (default: 1) |
123 | | - --fps <int> fps (default: 24) |
124 | | - --timestep-shift <int> shift timestep for NitroFusion models (default: 0). recommended N for |
125 | | - NitroSD-Realism around 250 and 500 for NitroSD-Vibrant |
126 | | - --upscale-repeats <int> Run the ESRGAN upscaler this many times (default: 1) |
127 | | - --upscale-tile-size <int> tile size for ESRGAN upscaling (default: 128) |
128 | | - --hires-width <int> highres fix target width, 0 to use --hires-scale (default: 0) |
129 | | - --hires-height <int> highres fix target height, 0 to use --hires-scale (default: 0) |
130 | | - --hires-steps <int> highres fix second pass sample steps, 0 to reuse --steps (default: 0) |
131 | | - --hires-upscale-tile-size <int> highres fix upscaler tile size, reserved for model-backed upscalers (default: |
132 | | - 128) |
133 | | - --cfg-scale <float> unconditional guidance scale: (default: 7.0) |
134 | | - --img-cfg-scale <float> image guidance scale for inpaint or image edit models: (default: same as |
135 | | - --cfg-scale) |
136 | | - --guidance <float> distilled guidance scale for models with guidance input (default: 3.5) |
137 | | - --slg-scale <float> skip layer guidance (SLG) scale, only for DiT models: (default: 0). 0 means |
138 | | - disabled, a value of 2.5 is nice for sd3.5 medium |
139 | | - --skip-layer-start <float> SLG enabling point (default: 0.01) |
140 | | - --skip-layer-end <float> SLG disabling point (default: 0.2) |
141 | | - --eta <float> noise multiplier (default: 0 for ddim_trailing, tcd, res_multistep and |
142 | | - res_2s; 1 for euler_a, er_sde and dpm++2s_a) |
143 | | - --flow-shift <float> shift value for Flow models like SD3.x or WAN (default: auto) |
144 | | - --high-noise-cfg-scale <float> (high noise) unconditional guidance scale: (default: 7.0) |
145 | | - --high-noise-img-cfg-scale <float> (high noise) image guidance scale for inpaint or image edit models (default: |
146 | | - same as --cfg-scale) |
147 | | - --high-noise-guidance <float> (high noise) distilled guidance scale for models with guidance input |
148 | | - (default: 3.5) |
149 | | - --high-noise-slg-scale <float> (high noise) skip layer guidance (SLG) scale, only for DiT models: (default: |
150 | | - 0) |
151 | | - --high-noise-skip-layer-start <float> (high noise) SLG enabling point (default: 0.01) |
152 | | - --high-noise-skip-layer-end <float> (high noise) SLG disabling point (default: 0.2) |
153 | | - --high-noise-eta <float> (high noise) noise multiplier (default: 0 for ddim_trailing, tcd, |
154 | | - res_multistep and res_2s; 1 for euler_a, er_sde and dpm++2s_a) |
155 | | - --strength <float> strength for noising/unnoising (default: 0.75) |
156 | | - --pm-style-strength <float> |
157 | | - --control-strength <float> strength to apply Control Net (default: 0.9). 1.0 corresponds to full |
158 | | - destruction of information in init image |
159 | | - --moe-boundary <float> timestep boundary for Wan2.2 MoE model. (default: 0.875). Only enabled if |
160 | | - `--high-noise-steps` is set to -1 |
161 | | - --vace-strength <float> wan vace strength |
162 | | - --vae-tile-overlap <float> tile overlap for vae tiling, in fraction of tile size (default: 0.5) |
163 | | - --hires-scale <float> highres fix scale when target size is not set (default: 2.0) |
164 | | - --hires-denoising-strength <float> highres fix second pass denoising strength (default: 0.7) |
165 | | - --increase-ref-index automatically increase the indices of references images based on the order |
166 | | - they are listed (starting with 1). |
167 | | - --disable-auto-resize-ref-image disable auto resize of ref images |
168 | | - --disable-image-metadata do not embed generation metadata on image files |
169 | | - --vae-tiling process vae in tiles to reduce memory usage |
170 | | - --temporal-tiling enable temporal tiling for LTX video VAE decode |
171 | | - --hires enable highres fix |
172 | | - -s, --seed RNG seed (default: 42, use random seed for < 0) |
173 | | - --sampling-method sampling method, one of [euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, |
174 | | - dpm++2mv2, ipndm, ipndm_v, lcm, ddim_trailing, tcd, res_multistep, res_2s, |
175 | | - er_sde, euler_cfg_pp, euler_a_cfg_pp] (default: euler for Flux/SD3/Wan, euler_a otherwise) |
176 | | - --high-noise-sampling-method (high noise) sampling method, one of [euler, euler_a, heun, dpm2, dpm++2s_a, |
177 | | - dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm, ddim_trailing, tcd, res_multistep, |
178 | | - res_2s, er_sde, euler_cfg_pp, euler_a_cfg_pp] default: euler for Flux/SD3/Wan, euler_a otherwise |
179 | | - --scheduler denoiser sigma scheduler, one of [discrete, karras, exponential, ays, gits, |
180 | | - smoothstep, sgm_uniform, simple, kl_optimal, lcm, bong_tangent, ltx2], default: |
181 | | - model-specific |
182 | | - --sigmas custom sigma values for the sampler, comma-separated (e.g., |
183 | | - "14.61,7.8,3.5,0.0"). |
184 | | - --hires-sigmas custom sigma values for the highres fix second pass, comma-separated (e.g., |
185 | | - "0.85,0.725,0.421875,0.0"). |
186 | | - --skip-layers layers to skip for SLG steps (default: [7,8,9]) |
187 | | - --high-noise-skip-layers (high noise) layers to skip for SLG steps (default: [7,8,9]) |
188 | | - -r, --ref-image reference image for Flux Kontext models (can be used multiple times) |
189 | | - --cache-mode caching method: 'easycache' (DiT), 'ucache' (UNET), |
190 | | - 'dbcache'/'taylorseer'/'cache-dit' (DiT block-level), 'spectrum' (UNET/DiT |
191 | | - Chebyshev+Taylor forecasting) |
192 | | - --cache-option named cache params (key=value format, comma-separated). easycache/ucache: |
193 | | - threshold=,start=,end=,decay=,relative=,reset=; dbcache/taylorseer/cache-dit: |
194 | | - Fn=,Bn=,threshold=,warmup=; spectrum: w=,m=,lam=,window=,flex=,warmup=,stop=. |
195 | | - Examples: "threshold=0.25" or "threshold=1.5,reset=0" |
196 | | - --scm-mask SCM steps mask for cache-dit: comma-separated 0/1 (e.g., |
197 | | - "1,1,1,0,0,1,0,0,1,0") - 1=compute, 0=can cache |
198 | | - --scm-policy SCM policy: 'dynamic' (default) or 'static' |
199 | | - --vae-tile-size tile size for vae tiling, format [X]x[Y] (default: 32x32) |
200 | | - --vae-relative-tile-size relative tile size for vae tiling, format [X]x[Y], in fraction of image size |
201 | | - if < 1, in number of tiles per dim if >=1 (overrides --vae-tile-size) |
| 5 | +```bash |
| 6 | +./bin/sd-cli -h |
202 | 7 | ``` |
203 | 8 |
204 | 9 | Metadata mode inspects PNG/JPEG container metadata without loading any model: |