The landscape of generative AI is evolving at a breakneck pace. While we've seen incredible progress in image creation, the next frontier is precise, controllable editing. The challenge lies in creating a model that can make complex changes without corrupting the original image's integrity.
Today, we are thrilled to introduce Qwen-Image-Edit-Plus, a 20B MMDiT model built on the foundation of Qwen-Image. This model is engineered to be a powerful tool for next-generation image editing, bringing a new level of precision and flexibility to the table.
Why Qwen-Image-Edit-Plus is a Game Changer
Existing models often fall into one of two camps: they generate images well but edit them poorly, or they can edit but only within a narrow range of operations. Qwen-Image-Edit-Plus was developed to bridge this gap, focusing on controlled, high-fidelity edits that are both precise and creative.
Key Advantages:
- Precision and Control: Unlike many models that make broad, unpredictable changes, Qwen-Image-Edit-Plus offers granular control over edits, whether it's a subtle tweak or a major transformation.
- Bilingual Text Proficiency: The model handles both Chinese and English text editing natively, a crucial feature for global applications.
- Superior Performance: In evaluations on public benchmarks, Qwen-Image-Edit-Plus achieves state-of-the-art (SOTA) results.
Semantic vs. Appearance Editing: The Core Difference
A key innovation of this model is its dual approach to image editing.
- Appearance-Level Editing: This is for surgical, low-level changes. Think of it as inpainting on a professional level. The goal is to make a specific change while keeping all other regions of the image absolutely untouched. This is ideal for tasks like removing an object, adding a new element, or correcting a flaw.
- Semantic-Level Editing: This is for high-level, creative transformations. The model is allowed to make broader pixel changes across the image as long as it preserves the core meaning and content. This is perfect for style transfer, object rotation, or IP creation where the final output is stylistically different but semantically consistent with the original.
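One rough way to make this distinction concrete is to measure how much of an image changed outside the region you meant to edit. The sketch below is not part of Qwen-Image-Edit-Plus or its API. It is a hypothetical diagnostic that treats images as plain Python lists of grayscale values, and the names `changed_fraction_outside` and `classify_edit`, the bounding-box format, and the 1% threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image: rows of 0-255 pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def changed_fraction_outside(original: Image, edited: Image, box: Box,
                             tolerance: int = 0) -> float:
    """Fraction of pixels outside `box` that differ by more than `tolerance`."""
    if len(original) != len(edited) or any(
        len(a) != len(b) for a, b in zip(original, edited)
    ):
        raise ValueError("images must have the same dimensions")
    top, left, bottom, right = box
    outside = 0
    changed = 0
    for y, (row_a, row_b) in enumerate(zip(original, edited)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if top <= y < bottom and left <= x < right:
                continue  # inside the intended edit region: any change is fine
            outside += 1
            if abs(a - b) > tolerance:
                changed += 1
    return changed / outside if outside else 0.0


def classify_edit(original: Image, edited: Image, box: Box,
                  max_outside_change: float = 0.01) -> str:
    """Label an edit by how much it touched pixels outside the target box.

    The 1% default threshold is an arbitrary assumption, not a model setting.
    """
    frac = changed_fraction_outside(original, edited, box)
    return "appearance-level" if frac <= max_outside_change else "semantic-level"


# Toy 4x4 example: a local patch edit versus a global brightness shift.
original = [[0] * 4 for _ in range(4)]
local_edit = [row[:] for row in original]
local_edit[1][1] = 255
local_edit[2][2] = 255
global_edit = [[50] * 4 for _ in range(4)]

print(classify_edit(original, local_edit, (1, 1, 3, 3)))
print(classify_edit(original, global_edit, (1, 1, 3, 3)))
```

In this toy case the local patch leaves every pixel outside the box unchanged, while the global shift changes all of them. On real outputs a small nonzero tolerance would probably be needed to absorb encoding noise.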
How to Use It: Practical Examples and Prompts
The beauty of Qwen-Image-Edit-Plus lies in its intuitive, prompt-based interface. Here are some examples of what you can do and the prompts you would use.
Example 1: Object Removal (Appearance-Level)
Original Image: A scenic photo of a beach with a person in the foreground.
Goal: Remove the person from the image.
Prompt:
Remove the person from the beach.
Result: The person is completely erased, and the model seamlessly fills the area with sand and water, perfectly matching the surrounding environment without affecting the sky, waves, or other parts of the photo.
Example 2: Text Editing (Bilingual)
Original Image: A storefront sign that reads "Coffee Shop".
Goal: Change the sign to "Bookstore" and add Chinese text below it.
Prompt:
Change the sign to "Bookstore" and add the Chinese text "书店" below it. Keep the original font and style.
Result: The text "Coffee Shop" is replaced with "Bookstore", and "书店" appears below it. The new text maintains the original font, size, color, and perspective, making the edit look completely natural.
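If you generate many sign edits like this one, it can help to assemble the prompt from its parts so that quoting stays consistent. The helper below is a minimal sketch. `build_sign_prompt` and its parameters are hypothetical names, and the wording simply mirrors the prompt in this example rather than any required syntax.

```python
from typing import Optional


def build_sign_prompt(new_text: str,
                      added_text: Optional[str] = None,
                      added_language: str = "Chinese",
                      keep_style: bool = True) -> str:
    """Assemble a sign-editing prompt in the same shape as Example 2."""
    parts = [f'Change the sign to "{new_text}"']
    if added_text:
        parts.append(f'and add the {added_language} text "{added_text}" below it')
    prompt = " ".join(parts) + "."
    if keep_style:
        prompt += " Keep the original font and style."
    return prompt


print(build_sign_prompt("Bookstore", "书店"))
```

Called with "Bookstore" and "书店", this reproduces the prompt used in this example.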
Example 3: Style Transfer (Semantic-Level)
Original Image: A realistic photograph of a cityscape at night.
Goal: Transform the image into a Vincent van Gogh-inspired style.
Prompt:
Change the style of the image to that of Vincent van Gogh, with a starry night feel.
Result: The entire image is transformed into a post-impressionist, brush-stroked rendering, complete with swirling textures and vibrant colors, while the underlying city structure and lights remain clearly recognizable.
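The three examples above can be collected into a small batch so that the same prompts are easy to rerun and compare. This is a sketch under stated assumptions: the file names are placeholders, `EditRequest` and `run_batch` are hypothetical helpers, and `edit_fn` stands in for whatever inference call you actually use, since the article does not specify one.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EditRequest:
    image_path: str  # placeholder file name, not from the article
    prompt: str      # instruction text copied from the examples
    label: str       # label taken from each example's heading


REQUESTS: List[EditRequest] = [
    EditRequest("beach.png",
                "Remove the person from the beach.",
                "Appearance-Level"),
    EditRequest("storefront.png",
                'Change the sign to "Bookstore" and add the Chinese text '
                '"书店" below it. Keep the original font and style.',
                "Bilingual"),
    EditRequest("cityscape.png",
                "Change the style of the image to that of Vincent van Gogh, "
                "with a starry night feel.",
                "Semantic-Level"),
]


def run_batch(requests: List[EditRequest],
              edit_fn: Callable[[str, str], str]) -> List[str]:
    """Call edit_fn(image_path, prompt) for each request, preserving order."""
    return [edit_fn(r.image_path, r.prompt) for r in requests]


def dry_run(image_path: str, prompt: str) -> str:
    """Stand-in for a real model call: just describes what would be sent."""
    return f"{image_path} <- {prompt}"


for line in run_batch(REQUESTS, dry_run):
    print(line)
```

Swapping `dry_run` for a function that calls your actual inference setup leaves the rest of the batch logic unchanged.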
Conclusion: The Future of Editing is Here
Qwen-Image-Edit-Plus is more than just a new model; it's a paradigm shift. It empowers creators by democratizing complex editing tasks. By offering unparalleled control and precision through simple prompts, it promises to be a fundamental building block for the next generation of creative tools and applications.
We believe this model will become an indispensable asset for developers, designers, and enthusiasts alike. What will you create with it?
Published: September 23, 2025