
Conversation

Contributor

@Dong1017 Dong1017 commented Sep 17, 2025

What does this PR do?

Adds

1. QwenImage Pipelines and Required Modules

(Consistent with Diffusers master)

a. Pipelines

  • mindone.diffusers.QwenImagePipeline
  • mindone.diffusers.QwenImageImg2ImgPipeline
  • mindone.diffusers.QwenImageInpaintPipeline
  • mindone.diffusers.QwenImageEditPipeline
  • mindone.diffusers.QwenImageEditInpaintPipeline

b. Modules

  • mindone.diffusers.models.AutoencoderQwenImage
  • mindone.diffusers.models.QwenImageTransformer2DModel
  • mindone.diffusers.loaders.QwenImageLoraLoaderMixin

2. Unit Tests (UTs) for the Pipelines

  • All UTs were set up according to Diffusers master, accessed on Sep 17, 2025.
    • tests/diffusers_tests/pipelines/qwenimage/test_qwenimage.py
    • tests/diffusers_tests/pipelines/qwenimage/test_qwenimage_img2img.py
    • tests/diffusers_tests/pipelines/qwenimage/test_qwenimage_inpaint.py
    • tests/diffusers_tests/pipelines/qwenimage/test_qwenimage_edit.py
  • With MindSpore 2.7.0, both the fp32 and bf16 UTs pass.
  • With MindSpore 2.6.0, the bf16 UTs pass, while the fp32 UTs raise a TypeError.
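The version-dependent dtype choice above can be sketched as a small helper. This is an illustrative sketch only, not part of the PR; the UTs themselves parametrize dtypes directly.

```python
def pick_dtype(ms_version: str) -> str:
    """Pick a UT dtype known to pass on the given MindSpore version.

    Per the results above, the fp32 UTs fail with a TypeError on 2.6.x,
    so fall back to bf16 there; 2.7.0+ passes both.
    """
    major, minor = (int(p) for p in ms_version.split(".")[:2])
    if (major, minor) < (2, 7):
        return "bf16"  # only the bf16 UTs pass on 2.6.x
    return "fp32"      # 2.7.0+ passes both; prefer full precision
```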

Usage

  • QwenImagePipeline
import mindspore as ms 
from mindone.diffusers import QwenImagePipeline 

pipe = QwenImagePipeline.from_pretrained("Qwen/Qwen-Image", mindspore_dtype=ms.bfloat16) 
prompt = "A cat holding a sign that says hello world" 
# Depending on the variant being used, the pipeline call will slightly vary. 
# Refer to the pipeline documentation for more details. 
image = pipe(prompt, num_inference_steps=50)[0][0] 
image.save("qwenimage.png") 
  • QwenImageImg2ImgPipeline
import mindspore as ms 
from mindone.diffusers import QwenImageImg2ImgPipeline
from mindone.diffusers.utils import load_image

pipe = QwenImageImg2ImgPipeline.from_pretrained("Qwen/Qwen-Image", mindspore_dtype=ms.bfloat16)
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = load_image(url).resize((1024, 1024))
prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney"
image = pipe(prompt=prompt, negative_prompt=" ", image=init_image, strength=0.95)[0][0]
image.save("qwenimage_img2img.png")
  • QwenImageInpaintPipeline
import mindspore as ms 
from mindone.diffusers import QwenImageInpaintPipeline 
from mindone.diffusers.utils import load_image 

pipe = QwenImageInpaintPipeline.from_pretrained("Qwen/Qwen-Image", mindspore_dtype=ms.bfloat16) 
prompt = "Face of a yellow cat, high resolution, sitting on a park bench" 
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" 
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" 
source = load_image(img_url) 
mask = load_image(mask_url) 
image = pipe(prompt=prompt, negative_prompt=" ", image=source, mask_image=mask, strength=0.85)[0][0] 
image.save("qwenimage_inpainting.png") 
  • QwenImageEditPipeline
import mindspore as ms 
from PIL import Image 
from mindone.diffusers import QwenImageEditPipeline 
from mindone.diffusers.utils import load_image 

pipe = QwenImageEditPipeline.from_pretrained("Qwen/Qwen-Image-Edit", mindspore_dtype=ms.bfloat16) 
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/yarn-art-pikachu.png").convert("RGB") 
prompt = ("Make Pikachu hold a sign that says 'Qwen Edit is awesome', yarn art style, detailed, vibrant colors") 
# Depending on the variant being used, the pipeline call will slightly vary. 
# Refer to the pipeline documentation for more details. 
image = pipe(image, prompt, num_inference_steps=50)[0][0] 
image.save("qwenimage_edit.png") 
  • QwenImageEditInpaintPipeline
import mindspore as ms 
from PIL import Image
from mindone.diffusers import QwenImageEditInpaintPipeline
from mindone.diffusers.utils import load_image

pipe = QwenImageEditInpaintPipeline.from_pretrained("Qwen/Qwen-Image-Edit", mindspore_dtype=ms.bfloat16)
prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
source = load_image(img_url)
mask = load_image(mask_url)
image = pipe(prompt=prompt, negative_prompt=" ", image=source, mask_image=mask, strength=1.0, num_inference_steps=50)[0][0]
image.save("qwenimage_edit_inpainting.png")

Performance

Experiments were run on Ascend Atlas 800T A2 machines with MindSpore 2.7.0.

| Pipeline | Weight Loading Time | Mode | Speed |
| --- | --- | --- | --- |
| QwenImagePipeline | 15m21s | PyNative | 9.93 s/it |
| QwenImageImg2ImgPipeline | 14m57s | PyNative | 9.56 s/it |
| QwenImageInpaintPipeline | 10m10s | PyNative | 4.80 s/it |
| QwenImageEditPipeline | 13m57s | PyNative | 13.25 s/it |
| QwenImageEditInpaintPipeline | 13m20s | PyNative | 13.98 s/it |

Limitation

QwenImageEditPipeline and QwenImageEditInpaintPipeline load modules from Qwen-Image-Edit. Using these two pipelines currently requires manually changing image_processor_type from Qwen2VLImageProcessorFast to Qwen2VLImageProcessor in Qwen-Image-Edit/processor/preprocessor_config.json
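The manual config edit can be applied with a small script. This is a hedged sketch: the `config_path` below is a hypothetical local path and must be adjusted to wherever `from_pretrained` downloaded the model snapshot.

```python
import json

def patch_image_processor(path: str) -> str:
    """Swap the fast image processor class for the slow one in a
    preprocessor_config.json file, then return the resulting value."""
    with open(path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    if cfg.get("image_processor_type") == "Qwen2VLImageProcessorFast":
        cfg["image_processor_type"] = "Qwen2VLImageProcessor"
        with open(path, "w", encoding="utf-8") as f:
            json.dump(cfg, f, indent=2)
    return cfg["image_processor_type"]

# Hypothetical local snapshot path; adjust to your cache location:
# patch_image_processor("Qwen-Image-Edit/processor/preprocessor_config.json")
```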

Notes

  1. Requires transformers==4.52.1.
  2. With a consistent random seed and identical hidden states from the text encoder, the generated images are nearly identical to those from the PyTorch implementation.
  3. TODO: JIT mode; LoRA test; UTs for the modules.
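For reproducing the comparison in note 2, seeding can be sketched as below. `mindspore.set_seed` is a real API; whether it alone makes the full pipeline deterministic is an assumption, and the PR's actual comparison also fixed the text-encoder hidden states.

```python
import random

def seed_everything(seed: int = 42) -> int:
    """Seed Python's RNG and, when available, MindSpore's global RNG."""
    random.seed(seed)
    try:
        import mindspore as ms
        ms.set_seed(seed)  # seeds MindSpore's global random state
    except ImportError:
        pass  # the sketch still runs without MindSpore installed
    return seed
```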

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the
    documentation guidelines
  • Did you build and run the code without any errors?
  • Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@xxx

@SamitHuang
Collaborator

How to fix the requirement of transformers==4.52.1?

@SamitHuang
Collaborator

Consider adding an inference example and a LoRA fine-tuning example in the examples folder, which would help introduce QwenImage.

@SamitHuang SamitHuang mentioned this pull request Sep 22, 2025
@Dong1017
Contributor Author

Dong1017 commented Sep 26, 2025

How to fix the requirement of transformers==4.52.1?

The main reason for requiring transformers==4.52.1 rather than transformers==4.50.0 is to avoid an AttributeError and to stay consistent with the requirements of Qwen-Image.
Using transformers==4.50.0 raises the following AttributeError:

../../transformers/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py:1517: in __init__
    super().__init__(config)
../../transformers/src/transformers/modeling_utils.py:1898: in __init__
    self.generation_config = GenerationConfig.from_model_config(config) if self.can_generate() else None
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        decoder_config = model_config.get_text_config(decoder=True)
        if decoder_config is not model_config:
            default_generation_config = GenerationConfig()
>           decoder_config_dict = decoder_config.to_dict()
                                  ^^^^^^^^^^^^^^^^^^^^^^
E           AttributeError: 'dict' object has no attribute 'to_dict'

../../transformers/src/transformers/generation/configuration_utils.py:1287: AttributeError

Upgrading transformers from 4.50.0 to 4.52.1 or a higher version resolves this error.
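A minimal sketch of checking the installed version against that requirement (hypothetical helper, not part of the PR; a plain `pip install --upgrade "transformers>=4.52.1"` achieves the same end):

```python
def version_tuple(v: str) -> tuple:
    """Parse a dotted version string into a comparable tuple of ints."""
    return tuple(int(p) for p in v.split(".")[:3])

def meets_requirement(installed: str, required: str = "4.52.1") -> bool:
    """True if the installed transformers version satisfies the requirement."""
    return version_tuple(installed) >= version_tuple(required)
```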

@vigo999 vigo999 added the new model add new model to mindone label Sep 29, 2025
@vigo999 vigo999 added this to mindone Sep 29, 2025
@vigo999 vigo999 moved this to In Progress in mindone Sep 29, 2025