ModelTC/Qwen-Image-Lightning

Qwen-Image-Lightning

We are excited to release a distilled version of Qwen-Image that preserves the base model's ability to render complex text.

🔥 Latest News

📑 Community Support

📑 Todo List

  • Qwen-Image-Lightning-8steps-V1.1
  • Qwen-Image-Lightning-8steps-V1.0
  • Qwen-Image-Lightning-4steps-V1.0
  • ComfyUI Workflow
  • Improve Quality
  • Qwen-Image-Edit-Lightning-8steps-V1.0
  • Qwen-Image-Edit-Lightning-4steps-V1.0
  • Qwen Edit ComfyUI Workflow

📑 T2I Performance Report

To characterize the strengths and limitations of the distilled models, we compare three models (Qwen-Image, Qwen-Image-Lightning-8steps-V1.1, and Qwen-Image-Lightning-4steps-V1.0) across different scenarios. The results can be reproduced by following the "Run Evaluation and Test" section below.

- Quality and Speed

Compared to the base model, the distilled models (8-step and 4-step) deliver a 12–25× speedup (NFE of 100 versus 8 or 4) with no significant loss in image quality in most cases.
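The 12–25× range is consistent with the NFE (number of function evaluations) values reported in the tables of this report, under two assumptions that are not stated in the original: a CFG scale above 1 costs two forward passes per step (conditional and unconditional), and runtime scales roughly with NFE. A minimal sketch of that arithmetic follows. The helper name `nfe` is hypothetical, and the steps/cfg values are the ones used in the run commands of this README.

```python
def nfe(steps: int, cfg: float) -> int:
    """Number of function evaluations, assuming CFG > 1 doubles the passes per step."""
    passes_per_step = 2 if cfg > 1.0 else 1
    return steps * passes_per_step


base = nfe(steps=50, cfg=4.0)        # base model: 50 steps with CFG
eight_step = nfe(steps=8, cfg=1.0)   # 8-step distilled model, no CFG
four_step = nfe(steps=4, cfg=1.0)    # 4-step distilled model, no CFG

print(base, eight_step, four_step)   # 100 8 4
print(base / eight_step)             # 12.5
print(base / four_step)              # 25.0
```

This ratio ignores the text encoder and VAE, so measured wall-clock speedups may differ.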

Prompt | Base (NFE=100) | 8steps-V1.1 (NFE=8) | 4steps-V1.0 (NFE=4)
一个会议室,墙上写着"3.14159265-358979-32384626-4338327950",一个小陀螺在桌上转动。 | [image 111] | [image 112] | [image 113]
宫崎骏的动漫风格。平视角拍摄,阳光下的古街热闹非凡。一个穿着青衫、手里拿着写着“阿里云”卡片的逍遥派弟子站在中间。旁边两个小孩惊讶的看着他。左边有一家店铺挂着“云存储”的牌子,里面摆放着发光的服务器机箱,门口两个侍卫守护者。右边有两家店铺,其中一家挂着“云计算”的牌子,一个穿着旗袍的美丽女子正看着里面闪闪发光的电脑屏幕;另一家店铺挂着“云模型”的牌子,门口放着一个大酒缸,上面写着“千问”,一位老板娘正在往里面倒发光的代码溶液。 | [image 121] | [image 122] | [image 123]
一副典雅庄重的对联悬挂于厅堂之中,房间是个安静古典的中式布置,桌子上放着一些青花瓷,对联上左书“义本生知人机同道善思新”,右书“通云赋智乾坤启数高志远”, 横批“智启通义”,字体飘逸,中间挂在一着一副中国风的画作,内容是岳阳楼。 | [image 131] | [image 132] | [image 133]
A movie poster. The first row is the movie title, which reads “Imagination Unleashed”. The second row is the movie subtitle, which reads “Enter a world beyond your imagination”. The third row reads “Cast: Qwen-Image”. The fourth row reads “Director: The Collective Imagination of Humanity”. The central visual features a sleek, futuristic computer from which radiant colors, whimsical creatures, and dynamic, swirling patterns explosively emerge, filling the composition with energy, motion, and surreal creativity. The background transitions from dark, cosmic tones into a luminous, dreamlike expanse, evoking a digital fantasy realm. At the bottom edge, the text “Launching in the Cloud, August 2025” appears in bold, modern sans-serif font with a glowing, slightly transparent effect, evoking a high-tech, cinematic aesthetic. The overall style blends sci-fi surrealism with graphic design flair—sharp contrasts, vivid color grading, and layered visual depth—reminiscent of visionary concept art and digital matte painting, 32K resolution, ultra-detailed. | [image 141] | [image 142] | [image 143]
一张企业级高质量PPT页面图像,整体采用科技感十足的星空蓝为主色调,背景融合流动的发光科技线条与微光粒子特效,营造出专业、现代且富有信任感的品牌氛围;页面顶部左侧清晰展示橘红色Alibaba标志,色彩鲜明、辨识度高。主标题位于画面中央偏上位置,使用大号加粗白色或浅蓝色字体写着“通义千问视觉基础模型”,字体现代简洁,突出技术感;主标题下方紧接一行楷体中文文字:“原生中文·复杂场景·自动布局”,字体柔和优雅,形成科技与人文的融合。下方居中排布展示了四张与图片,分别是:一幅写实与水墨风格结合的梅花特写,枝干苍劲、花瓣清雅,背景融入淡墨晕染与飘雪效果,体现坚韧不拔的精神气质;上方写着黑色的楷体"梅傲"。一株生长于山涧石缝中的兰花,叶片修长、花朵素净,搭配晨雾缭绕的自然环境,展现清逸脱俗的文人风骨;上方写着黑色的楷体"兰幽"。一组迎风而立的翠竹,竹叶随风摇曳,光影交错,背景为青灰色山岩与流水,呈现刚柔并济、虚怀若谷的文化意象;上方写着黑色的楷体"竹清"。一片盛开于秋日庭院的菊花丛,花色丰富、层次分明,配以落叶与古亭剪影,传递恬然自适的生活哲学;上方写着黑色的楷体"菊淡"。所有图片采用统一尺寸与边框样式,呈横向排列。页面底部中央用楷体小字写明“2025年8月,敬请期待”,排版工整、结构清晰,整体风格统一且细节丰富,极具视觉冲击力与品牌调性。 | [image 151] | [image 152] | [image 153]

- Dense or Small Text Rendering

In scenarios involving dense or small text, the base model is more likely to produce better results.

Prompt | Base (NFE=100) | 8steps-V1.1 (NFE=8) | 4steps-V1.0 (NFE=4)
一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 “一、Qwen-Image的技术路线: 探索视觉生成基础模型的极限,开创理解与生成一体化的未来。二、Qwen-Image的模型特色:1、复杂文字渲染。支持中英渲染、自动布局; 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景:赋能专业内容创作、助力生成式AI发展。” | [image 211] | [image 212] | [image 213]

- Hair-like Details

In scenes containing hair-like details, the base model demonstrates superior rendering fidelity, whereas the distilled models may yield outputs that appear either noticeably blurred or excessively sharpened.

Prompt | Base (NFE=100) | 8steps-V1.1 (NFE=8) | 4steps-V1.0 (NFE=4)
A capybara wearing a suit holding a sign that reads Hello World. | [image 311] | [image 312] | [image 313]

- Highly Complex Scenes

In highly complex scenes, all three models may fail to produce satisfactory results.

Prompt | Base (NFE=100) | 8steps-V1.1 (NFE=8) | 4steps-V1.0 (NFE=4)
"A vibrant, warm neon-lit street scene in Hong Kong at the afternoon, with a mix of colorful Chinese and English signs glowing brightly. The atmosphere is lively, cinematic, and rain-washed with reflections on the pavement. The colors are vivid, full of pink, blue, red, and green hues. Crowded buildings with overlapping neon signs. 1980s Hong Kong style. Signs include: "龍鳳冰室" "金華燒臘" "HAPPY HAIR" "鴻運茶餐廳" "EASY BAR" "永發魚蛋粉" "添記粥麵" "SUNSHINE MOTEL" "美都餐室" "富記糖水" "太平館" "雅芳髮型屋" "STAR KTV" "銀河娛樂城" "百樂門舞廳" "BUBBLE CAFE" "萬豪麻雀館" "CITY LIGHTS BAR" "瑞祥香燭莊" "文記文具" "GOLDEN JADE HOTEL" "LOVELY BEAUTY" "合興百貨" "興旺電器" And the background is warm yellow street and with all stores' lights on. | [image 411] | [image 412] | [image 413]

- Inconsistencies in Model Rankings Across Test Cases

Relative quality varies across test cases: in some the base model performs better, while in others the distilled models do. Even for the same prompt, the ranking of the models can change substantially with resolution.

Prompt | Base (NFE=100) | 8steps-V1.1 (NFE=8) | 4steps-V1.0 (NFE=4)
A young girl wearing school uniform stands in a classroom, writing on a chalkboard. The text "Introducing Qwen-Image, a foundational image generation model that excels in complex text rendering and precise image editing" appears in neat white chalk at the center of the blackboard. Soft natural light filters through windows, casting gentle shadows. The scene is rendered in a realistic photography style with fine details, shallow depth of field, and warm tones. The girl's focused expression and chalk dust in the air add dynamism. Background elements include desks and educational posters, subtly blurred to emphasize the central action. Ultra-detailed 32K resolution, DSLR-quality, soft bokeh effect, documentary-style composition. | [image 511] | [image 512] | [image 513]
A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". | [image 611] | [image 612] | [image 613]
A coffee shop entrance features a chalkboard sign reading "Qwen Coffee 😊 $2 per cup," with a neon light beside it displaying "通义千问". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "π≈3.1415926-53589793-23846264-33832795-02384197". | [image 621] | [image 622] | [image 623]

📑 Editing Performance Report

We compare three models (Qwen-Image-Edit-Diffusers, Qwen-Image-Edit-Lightning-8steps-V1.0, and Qwen-Image-Edit-Lightning-4steps-V1.0) across different editing scenarios. The results can be reproduced by following the "Run Evaluation and Test" section below.

Input Image | Prompt | Base Edit (NFE=100) | 8steps-V1.0 (NFE=8) | 4steps-V1.0 (NFE=4)
[image 111] | Replace the words 'HEALTH INSURANCE' on the letter blocks with 'Tomorrow will be better'. | [image 112] | [image 113] | [image 114]
Notes on the row above: Bad case: the first "m" appears as "mn" due to an extra stroke. Bad case: the letter "o" is missing.
[image 121] | Replace the words 'HEALTH INSURANCE' on the letter blocks with '明天会更好'. | [image 122] | [image 123] | [image 124]
Notes on the row above: Bad case: an extra "更" is generated.
[image 131] | Replace the polka-dot shirt with a light blue shirt. | [image 132] | [image 133] | [image 134]
[image 141] | Remove the hair from the plate. | [image 142] | [image 143] | [image 144]
[image 151] | Generate a cartoon profile picture of the person. | [image 152] | [image 153] | [image 154]
[image 161] | Transform the character in the image into an anime style, and add the text: "Accelerate image generation and editing with Lightx2V Qwen-Image-Lightning". | [image 162] | [image 163] | [image 164]
Notes on the row above: Bad case: incorrect spelling "Lightx2V". Bad case: incorrect spelling "editing with". Failure case.
[image 171] | 将图中的人物改为日漫风格,并给图片添加文字“使用Lightx2V Qwen-Image-Lightning 加速图像生成和图片编辑”。 | [image 172] | [image 173] | [image 174]
[image 181] | 将图中红色框中的文字改为"殇",只改变框内的画面,框外的画面维持不变。 | [image 182] | [image 183] | [image 184]

🚀 Run Evaluation and Test

Installation

Follow the Qwen-Image instructions to set up the Python environment (e.g., diffusers v0.35.1) and download the base model.

Model Download

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download lightx2v/Qwen-Image-Lightning --local-dir ./Qwen-Image-Lightning
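The same download can be done from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The helper `lora_file` is hypothetical and only rebuilds the checkpoint file names that appear in the run commands of this README.

```python
from pathlib import Path

REPO_ID = "lightx2v/Qwen-Image-Lightning"
LOCAL_DIR = Path("./Qwen-Image-Lightning")


def lora_file(steps: int, version: str = "V1.0", edit: bool = False) -> Path:
    """Local path of a LoRA checkpoint, following the naming used in this README."""
    family = "Qwen-Image-Edit-Lightning" if edit else "Qwen-Image-Lightning"
    return LOCAL_DIR / f"{family}-{steps}steps-{version}.safetensors"


def download() -> str:
    """Download the repository snapshot into LOCAL_DIR (needs network access)."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(repo_id=REPO_ID, local_dir=str(LOCAL_DIR))


# Example: call download() once, then check that the expected file is present.
# download()
# print(lora_file(8), lora_file(8).exists())
```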

Run 8-step Model

# 8 steps, cfg 1.0
python generate_with_diffusers.py \
--prompt_list_file examples/prompt_list.txt \
--out_dir test_lora_8_step_results \
--lora_path Qwen-Image-Lightning/Qwen-Image-Lightning-8steps-V1.0.safetensors \
--base_seed 42 --steps 8 --cfg 1.0

Run 4-step Model

# 4 steps, cfg 1.0
python generate_with_diffusers.py \
--prompt_list_file examples/prompt_list.txt \
--out_dir test_lora_4_step_results \
--lora_path Qwen-Image-Lightning/Qwen-Image-Lightning-4steps-V1.0.safetensors \
--base_seed 42 --steps 4 --cfg 1.0

Run base Model

# 50 steps, cfg 4.0
python generate_with_diffusers.py \
--prompt_list_file examples/prompt_list.txt \
--out_dir test_base_results \
--base_seed 42 --steps 50 --cfg 4.0
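For readers who prefer to call diffusers directly, the three text-to-image settings above could look roughly like the sketch below. This is not the contents of generate_with_diffusers.py. It assumes the base model ID `Qwen/Qwen-Image`, a CUDA GPU, and that `--cfg` corresponds to the pipeline's `true_cfg_scale` argument. It also uses the pipeline's default scheduler, which the repository's script may configure differently. The names `SETTINGS` and `generate` are hypothetical.

```python
# name: (LoRA file or None, steps, cfg), mirroring the three commands above
SETTINGS = {
    "8step": ("Qwen-Image-Lightning-8steps-V1.0.safetensors", 8, 1.0),
    "4step": ("Qwen-Image-Lightning-4steps-V1.0.safetensors", 4, 1.0),
    "base": (None, 50, 4.0),
}


def generate(prompt, setting="8step", seed=42, lora_dir="Qwen-Image-Lightning"):
    """Generate one image; needs a GPU, torch, diffusers and the downloaded weights."""
    import torch
    from diffusers import DiffusionPipeline

    lora_name, steps, cfg = SETTINGS[setting]
    pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    if lora_name is not None:
        # Load the distilled LoRA on top of the base model.
        pipe.load_lora_weights(lora_dir, weight_name=lora_name)
    result = pipe(
        prompt=prompt,
        negative_prompt=" ",
        num_inference_steps=steps,
        true_cfg_scale=cfg,
        generator=torch.Generator(device="cuda").manual_seed(seed),
    )
    return result.images[0]


# Example (assumed usage):
# generate("A capybara wearing a suit holding a sign that reads Hello World.").save("out.png")
```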

Run 8-step Edit Model

# 8 steps, cfg 1.0
python generate_with_diffusers.py \
--prompt_list_file examples/edit_prompt_list.txt \
--image_path_list_file examples/image_path_list.txt \
--model_name Qwen/Qwen-Image-Edit \
--out_dir test_lora_8_step_edit_results \
--lora_path Qwen-Image-Lightning/Qwen-Image-Edit-Lightning-8steps-V1.0.safetensors \
--base_seed 42 --steps 8 --cfg 1.0

Run 4-step Edit Model

# 4 steps, cfg 1.0
python generate_with_diffusers.py \
--prompt_list_file examples/edit_prompt_list.txt \
--image_path_list_file examples/image_path_list.txt \
--model_name Qwen/Qwen-Image-Edit \
--out_dir test_lora_4_step_edit_results \
--lora_path Qwen-Image-Lightning/Qwen-Image-Edit-Lightning-4steps-V1.0.safetensors \
--base_seed 42 --steps 4 --cfg 1.0

Run Base Edit Model

# 50 steps, cfg 4.0
python generate_with_diffusers.py \
--prompt_list_file examples/edit_prompt_list.txt \
--image_path_list_file examples/image_path_list.txt \
--model_name Qwen/Qwen-Image-Edit \
--out_dir test_base_edit_results \
--base_seed 42 --steps 50 --cfg 4.0
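The six invocations above differ only in a few arguments, so they can be driven from one script. The sketch below is a hypothetical wrapper (`build_command`, `RUNS`, and `run_all` are not part of the repository) that uses only the flags shown in this section and defaults to a dry run that prints the commands.

```python
import subprocess

LORA_DIR = "Qwen-Image-Lightning"


def build_command(steps, cfg, out_dir, lora_path=None, edit=False, base_seed=42):
    """Assemble one generate_with_diffusers.py invocation as an argument list."""
    prompt_file = "examples/edit_prompt_list.txt" if edit else "examples/prompt_list.txt"
    cmd = ["python", "generate_with_diffusers.py", "--prompt_list_file", prompt_file]
    if edit:
        cmd += ["--image_path_list_file", "examples/image_path_list.txt",
                "--model_name", "Qwen/Qwen-Image-Edit"]
    cmd += ["--out_dir", out_dir]
    if lora_path is not None:
        cmd += ["--lora_path", lora_path]
    cmd += ["--base_seed", str(base_seed), "--steps", str(steps), "--cfg", str(cfg)]
    return cmd


RUNS = [
    dict(steps=8, cfg=1.0, out_dir="test_lora_8_step_results",
         lora_path=f"{LORA_DIR}/Qwen-Image-Lightning-8steps-V1.0.safetensors"),
    dict(steps=4, cfg=1.0, out_dir="test_lora_4_step_results",
         lora_path=f"{LORA_DIR}/Qwen-Image-Lightning-4steps-V1.0.safetensors"),
    dict(steps=50, cfg=4.0, out_dir="test_base_results"),
    dict(steps=8, cfg=1.0, out_dir="test_lora_8_step_edit_results", edit=True,
         lora_path=f"{LORA_DIR}/Qwen-Image-Edit-Lightning-8steps-V1.0.safetensors"),
    dict(steps=4, cfg=1.0, out_dir="test_lora_4_step_edit_results", edit=True,
         lora_path=f"{LORA_DIR}/Qwen-Image-Edit-Lightning-4steps-V1.0.safetensors"),
    dict(steps=50, cfg=4.0, out_dir="test_base_edit_results", edit=True),
]


def run_all(dry_run=True):
    """Print every command, or execute them one after another when dry_run is False."""
    for kwargs in RUNS:
        cmd = build_command(**kwargs)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_all()` prints the six commands. `run_all(dry_run=False)` executes them from the repository root and stops at the first failure.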

🎨 ComfyUI Workflow

ComfyUI workflows are available in the workflows/ directory.

  • The Qwen-Image workflow is based on the Qwen-Image ComfyUI tutorial and has been verified with ComfyUI repository at commit ID 37d620a6b85f61b824363ed8170db373726ca45a.

  • The Qwen-Image-Edit workflow is based on the Qwen-Image-Edit ComfyUI tutorial. We noticed a gap in performance compared to diffusers inference, which may stem from differences in how ComfyUI and diffusers handle the processing.

Workflow Files

  • workflows/qwen-image-8steps.json - 8-step lightning workflow for Qwen-Image
  • workflows/qwen-image-4steps.json - 4-step lightning workflow for Qwen-Image
  • workflows/qwen-image-edit-8steps.json - 8-step lightning workflow for Qwen-Image-Edit
  • workflows/qwen-image-edit-4steps.json - 4-step lightning workflow for Qwen-Image-Edit

Usage

  1. Install ComfyUI following the official instructions
  2. Download the Qwen-Image or Qwen-Image-Edit base model and place the UNet/CLIP/VAE files in the proper ComfyUI folders, following the Qwen-Image ComfyUI tutorial or the Qwen-Image-Edit ComfyUI tutorial
  3. For Qwen Image workflows:
    • 8-step: Load workflows/qwen-image-8steps.json, put Qwen-Image-Lightning-8steps-V1.0.safetensors into ComfyUI/models/loras/, and set KSampler steps to 8
    • 4-step: Load workflows/qwen-image-4steps.json, put Qwen-Image-Lightning-4steps-V1.0.safetensors into ComfyUI/models/loras/, and set KSampler steps to 4
  4. For Qwen Image Edit workflows:
    • 8-step: Load workflows/qwen-image-edit-8steps.json, put Qwen-Image-Edit-Lightning-8steps-V1.0.safetensors into ComfyUI/models/loras/, and set KSampler steps to 8
    • 4-step: Load workflows/qwen-image-edit-4steps.json, put Qwen-Image-Edit-Lightning-4steps-V1.0.safetensors into ComfyUI/models/loras/, and set KSampler steps to 4
  5. Run the workflow to generate images
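Before launching ComfyUI, it can help to confirm that the files named in the steps above are where the workflows expect them. The sketch below is a hypothetical helper (`WORKFLOWS` and `missing_files` are not part of the repository). It assumes the repository checkout and the ComfyUI installation are both local directories.

```python
from pathlib import Path

# Workflow file -> (LoRA file it expects, KSampler steps), as listed in the usage steps.
WORKFLOWS = {
    "qwen-image-8steps.json": ("Qwen-Image-Lightning-8steps-V1.0.safetensors", 8),
    "qwen-image-4steps.json": ("Qwen-Image-Lightning-4steps-V1.0.safetensors", 4),
    "qwen-image-edit-8steps.json": ("Qwen-Image-Edit-Lightning-8steps-V1.0.safetensors", 8),
    "qwen-image-edit-4steps.json": ("Qwen-Image-Edit-Lightning-4steps-V1.0.safetensors", 4),
}


def missing_files(repo_root, comfy_root):
    """Return the workflow and LoRA paths that do not exist yet."""
    repo_root, comfy_root = Path(repo_root), Path(comfy_root)
    missing = []
    for workflow, (lora, _steps) in WORKFLOWS.items():
        candidates = (
            repo_root / "workflows" / workflow,
            comfy_root / "models" / "loras" / lora,
        )
        for path in candidates:
            if not path.is_file():
                missing.append(path)
    return missing


# Example (assumed paths):
# for path in missing_files(".", "../ComfyUI"):
#     print("missing:", path)
```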

License Agreement

The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated contents, granting you the freedom to use them while ensuring that your usage complies with the provisions of this license. You are fully accountable for your use of the models, which must not involve sharing any content that violates applicable laws, causes harm to individuals or groups, disseminates personal information intended for harm, spreads misinformation, or targets vulnerable populations. For a complete list of restrictions and details regarding your rights, please refer to the full text of the license.

Acknowledgements

We built upon and reused code from the following projects: Qwen-Image, licensed under the Apache License 2.0.

The evaluation text prompts are from Qwen-Image, Qwen-Image Blog and Qwen-Image-Service.

The image-editing test cases are from Qwen-Image-Edit-api and Reddit.
