Committed by nielsr (HF Staff)
Commit 5c1476c (verified) · 1 Parent(s): 8eeb72d

Enhance model card with pipeline tag, paper, project, and code links


This PR significantly enhances the model card for the MagicQuill V2 model by adding comprehensive details from the project's GitHub repository.

It includes:
- The `pipeline_tag: image-to-image` in the metadata, enabling better discoverability on the Hugging Face Hub through relevant filters.
- Direct links to the paper ([Hugging Face Papers](https://huggingface.co/papers/2512.03046)), the project page (https://magicquill.art/v2/), the GitHub repository (https://github.com/zliucz/MagicQuillV2), and a Hugging Face Spaces demo (https://huggingface.co/spaces/AI4Editing/MagicQuillV2).
- A concise abstract (TLDR), a video demonstration, system overview, hardware requirements, and detailed setup instructions to guide users on how to run the system locally.
- The BibTeX citation for proper attribution.

Please review and merge this PR if everything looks good!

Files changed (1)
  1. README.md +87 -3
README.md CHANGED
@@ -1,3 +1,87 @@
- ---
- license: cc-by-nc-sa-4.0
- ---
+ ---
+ license: cc-by-nc-sa-4.0
+ pipeline_tag: image-to-image
+ ---
+
+ # 🪶 MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues
+
+ - **Paper:** [MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues](https://huggingface.co/papers/2512.03046)
+ - **Project Page:** https://magicquill.art/v2/
+ - **Code Repository:** https://github.com/zliucz/MagicQuillV2
+ - **Hugging Face Spaces Demo:** https://huggingface.co/spaces/AI4Editing/MagicQuillV2
+
+ <br>
+
+ <div align="center">
+ <video src="https://github.com/user-attachments/assets/58079152-7729-48ed-9bb4-0ddfd1873dd0" width="100%" controls autoplay muted loop></video>
+ </div>
+
+ <br>
+
+ **TLDR:** MagicQuill V2 introduces a layered composition paradigm to generative image editing, disentangling creative intent into controllable visual cues (Content, Spatial, Structural, Color) for precise and intuitive control.
+
+ ## Hardware Requirements
+
+ Our model is based on Flux Kontext, which is large and computationally intensive.
+ - **VRAM**: Approximately **40GB** of VRAM is required for inference.
+ - **Speed**: It takes about **30 seconds** to generate a single image.
+
+ > **Important**: This is a research project focused on pushing the boundaries of interactive image editing. If you do not have sufficient GPU memory, we recommend checking out our [**MagicQuill V1**](https://github.com/ant-research/MagicQuill) or trying the online demo on [**Hugging Face Spaces**](https://huggingface.co/spaces/AI4Editing/MagicQuillV2).
+
+ ## Setup
+
+ 1. **Clone the repository**
+ ```bash
+ git clone https://github.com/magic-quill/MagicQuillV2.git
+ cd MagicQuillV2
+ ```
+
+ 2. **Create environment**
+ ```bash
+ conda create -n MagicQuillV2 python=3.10 -y
+ conda activate MagicQuillV2
+ ```
+
+ 3. **Install dependencies**
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 4. **Download models**
+ Download the models from [Hugging Face](https://huggingface.co/LiuZichen/MagicQuillV2-models) and place them in the `models/` directory.
+
+ ```bash
+ huggingface-cli download LiuZichen/MagicQuillV2-models --local-dir models
+ ```
+
+ 5. **Run the demo**
+ ```bash
+ python app.py
+ ```
+
+ ## System Overview
+
+ The MagicQuill V2 interactive system is designed to unify our layered composition framework.
+
+ <div align="center">
+ <img src="https://github.com/zliucz/MagicQuillV2/raw/main/assets/V2_UI.png" alt="MagicQuill V2 UI" width="100%">
+ </div>
+
+ ### Key Upgrades from V1
+
+ 1. **Toolbar (A)**: Features a new **Local Edit Brush** for defining the target editing area, along with tools for sketching edges and applying color.
+ 2. **Visual Cue Manager (B)**: Holds all content layer visual cues (**foreground props**) that users can drag onto the canvas to define what to generate.
+ 3. **Image Segmentation Panel (C)**: Accessed via the segment icon, this panel allows precise object extraction using SAM (Segment Anything Model) with positive/negative dots or bounding boxes.
+
+ ## Citation
+
+ If you find MagicQuill V2 useful for your research, please cite our paper:
+
+ ```bibtex
+ @article{liu2025magicquillv2,
+ title={MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues},
+ author={Zichen Liu and Yue Yu and Hao Ouyang and Qiuyu Wang and Shuailei Ma and Ka Leong Cheng and Wen Wang and Qingyan Bai and Yuxuan Zhang and Yanhong Zeng and Yixuan Li and Xing Zhu and Yujun Shen and Qifeng Chen},
+ journal={arXiv:2512.03046},
+ year={2025}
+ }
+ ```
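
The hardware requirement stated in the README above (approximately 40GB of VRAM) can be checked before working through the setup steps. The following is a minimal sketch, not part of the MagicQuill V2 repository: it assumes an NVIDIA GPU with `nvidia-smi` on the `PATH`, and the function name `has_enough_vram` and the 40000 MiB threshold are illustrative choices that approximate the "about 40GB" figure.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the MagicQuill V2 repository.
# The threshold approximates the README's "approximately 40GB" (assumption).
REQUIRED_MIB=40000

# Succeeds (exit status 0) only when $1 is a non-negative integer number
# of MiB that meets or exceeds REQUIRED_MIB.
has_enough_vram() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric input: treat as insufficient
  esac
  [ "$1" -ge "$REQUIRED_MIB" ]
}

# Query the total memory of the first GPU when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  total_mib="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)"
  if has_enough_vram "$total_mib"; then
    echo "GPU reports ${total_mib} MiB: local inference looks feasible."
  else
    echo "GPU reports '${total_mib}' MiB: consider the Hugging Face Spaces demo instead."
  fi
else
  echo "nvidia-smi not found: cannot check VRAM on this machine."
fi
```

Only the first GPU is inspected, and the check reads total rather than currently free memory, so a card shared with other processes may still run out of memory during inference.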