KvLove committed on
Commit f4dc67d · verified · 1 Parent(s): a0b698b

Upload 6 files

Add LoRA weights + update README.md

README.md CHANGED
@@ -1,3 +1,192 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ base_model:
+ - Qwen/Qwen-Image-Edit
+ pipeline_tag: image-to-image
+ tags:
+ - lora
+ - qwen
+ - qwen-image
+ - qwen-image-edit
+ - image-editing
+ - inscene
+ - spatial-understanding
+ - scene-coherence
+ - computer-vision
+ - InScene
+ ---
+
+ # Qwen Image Edit Inscene LoRA
+
+ An open-source LoRA (Low-Rank Adaptation) adapter for Qwen-Image-Edit, built by [FlyMy.AI](https://flymy.ai), that specializes in in-scene image editing.
+
+ ## 🌟 About FlyMy.AI
+
+ Agentic Infra for GenAI. FlyMy.AI is B2B infrastructure for building and running GenAI media agents.
+
+ **🔗 Useful Links:**
+ - 🌐 [Official Website](https://flymy.ai)
+ - 📚 [Documentation](https://docs.flymy.ai/intro)
+ - 💬 [Discord Community](https://discord.com/invite/t6hPBpSebw)
+ - 🤗 [LoRA Training Repository](https://github.com/FlyMyAI/flymyai-lora-trainer)
+ - 🐦 [X (Twitter)](https://x.com/flymyai)
+ - 💼 [LinkedIn](https://linkedin.com/company/flymyai)
+ - 📺 [YouTube](https://youtube.com/@flymyai)
+ - 📸 [Instagram](https://www.instagram.com/flymy_ai)
+
+ ---
+
+ ## 🚀 Features
+
+ - LoRA-based fine-tuning for efficient in-scene image editing
+ - Specialized for the Qwen-Image-Edit model
+ - Enhanced control over scene composition and object positioning
+ - Optimized for maintaining scene coherence during edits
+ - Compatible with Hugging Face `diffusers`
+ - Control-based image editing with improved spatial understanding
+
+ ---
+
+ ## 📦 Installation
+
+ 1. Install the required packages:
+ ```bash
+ pip install torch torchvision diffusers transformers accelerate
+ ```
+
+ 2. Upgrade `diffusers` to the latest version from GitHub (older releases may not include `QwenImageEditPipeline`):
+ ```bash
+ pip install git+https://github.com/huggingface/diffusers
+ ```
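
Before loading the pipeline, it can help to confirm that the packages from the commands above are actually visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of any of these libraries.

```python
import importlib.util

# Import names of the packages listed in the pip command above.
REQUIRED = ["torch", "torchvision", "diffusers", "transformers", "accelerate"]


def missing_packages(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    # find_spec only locates a top-level module; it does not import it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that each package can be located. It does not verify that the installed `diffusers` version is recent enough to provide `QwenImageEditPipeline`.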
+
+ ---
+
+ ## 🧪 Usage
+
+ ### 🔧 Qwen-Image-Edit Initialization
+
+ ```python
+ from diffusers import QwenImageEditPipeline
+ import torch
+ from PIL import Image
+
+ # Load the pipeline directly in bfloat16, then move it to the GPU
+ pipeline = QwenImageEditPipeline.from_pretrained(
+     "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
+ )
+ pipeline.to("cuda")
+ ```
+
+ ### 🔌 Load LoRA Weights
+
+ ```python
+ # Load the in-scene LoRA weights shipped in this repository
+ pipeline.load_lora_weights("./flymy_qwen_image_edit_inscene_lora.safetensors")
+ ```
+
+ ### 🎨 Edit Image with Qwen-Image-Edit Inscene LoRA
+
+ ```python
+ # Load input image
+ image = Image.open("./assets/qie2_input.jpg").convert("RGB")
+
+ # Define in-scene editing prompt
+ prompt = "Make a shot in the same scene of the left hand securing the edge of the cutting board while the right hand tilts it, causing the chopped tomatoes to slide off into the pan, camera angle shifts slightly to the left to center more on the pan."
+
+ # Generate edited image with enhanced scene understanding
+ inputs = {
+     "image": image,
+     "prompt": prompt,
+     "generator": torch.manual_seed(0),
+     "true_cfg_scale": 4.0,
+     "negative_prompt": " ",
+     "num_inference_steps": 50,
+ }
+
+ with torch.inference_mode():
+     output = pipeline(**inputs)
+     output_image = output.images[0]
+     output_image.save("edited_image.png")
+ ```
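
The example prompt above has a recognizable shape: a fixed opening ("Make a shot in the same scene of"), an action description, and an optional camera instruction. The sketch below assembles prompts in that shape. It is inferred from this single example only; the README does not say whether the opening phrase is a required trigger, and `build_inscene_prompt` is a hypothetical helper.

```python
# Opening phrase copied from the README's example prompt (assumed, not a documented trigger).
PREFIX = "Make a shot in the same scene of "


def build_inscene_prompt(action, camera=None):
    """Compose an in-scene prompt from an action and an optional camera instruction."""
    prompt = PREFIX + action.strip().rstrip(".,")
    if camera:
        prompt += ", " + camera.strip().rstrip(".,")
    return prompt + "."


if __name__ == "__main__":
    print(
        build_inscene_prompt(
            "the right hand stirring the tomatoes in the pan",
            camera="camera angle shifts slightly to the right",
        )
    )
```

The returned string can be passed as the `prompt` entry of the `inputs` dictionary shown above.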
+
+ ### 🖼️ Sample Output - Qwen-Image-Edit Inscene
+
+ **Input Image:**
+
+ ![Input Image](./assets/qie2_input.jpg)
+
+ **Prompt:**
+ "Make a shot in the same scene of the left hand securing the edge of the cutting board while the right hand tilts it, causing the chopped tomatoes to slide off into the pan, camera angle shifts slightly to the left to center more on the pan."
+
+ **Output without LoRA:**
+
+ ![Output without LoRA](./assets/qie2_orig.jpg)
+
+ **Output with Inscene LoRA:**
+
+ ![Output with LoRA](./assets/qie2_lora.jpg)
+
+ ---
+
+ ### Workflow Features
+
+ - ✅ Pre-configured for Qwen-Image-Edit + Inscene LoRA inference
+ - ✅ Optimized settings for in-scene editing quality
+ - ✅ Enhanced spatial understanding and scene coherence
+ - ✅ Easy prompt and parameter adjustment
+ - ✅ Compatible with various input image types
+
+ ---
+
+ ## 🎯 What is Inscene LoRA?
+
+ This LoRA model is specifically trained to enhance Qwen-Image-Edit's ability to perform **in-scene image editing**. It focuses on:
+
+ - **Scene Coherence**: Maintaining logical spatial relationships within the scene
+ - **Object Positioning**: Better understanding of object placement and movement
+ - **Camera Perspective**: Improved handling of viewpoint changes and camera movements
+ - **Action Sequences**: Enhanced ability to depict sequential actions within the same scene
+ - **Contextual Editing**: Preserving scene context while making targeted modifications
+
+ ---
+
+ ## 🔧 Training Information
+
+ This LoRA model was trained using the [FlyMy.AI LoRA Trainer](https://github.com/FlyMyAI/flymyai-lora-trainer) with:
+
+ - **Base Model**: Qwen/Qwen-Image-Edit
+ - **Training Focus**: In-scene image editing and spatial understanding
+ - **Dataset**: Curated collection of scene-based editing examples (InScene dataset)
+ - **Optimization**: Low-rank adaptation for efficient fine-tuning
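
To illustrate why low-rank adaptation is efficient: instead of updating a full `d_out x d_in` weight matrix `W`, LoRA trains two small matrices `B` (`d_out x r`) and `A` (`r x d_in`) and uses `W + B·A`. The sketch below compares the parameter counts. The layer size and rank are hypothetical values chosen for illustration; the README does not state the rank or the layers this adapter targets.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for a single weight matrix."""
    full = d_out * d_in           # every entry of W is trainable
    lora = rank * (d_out + d_in)  # entries of B (d_out x r) plus A (r x d_in)
    return full, lora


if __name__ == "__main__":
    # Hypothetical layer size and rank, not taken from this model.
    full, lora = lora_param_counts(d_out=3072, d_in=3072, rank=16)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

The saving grows with layer size, since the LoRA count scales with `d_out + d_in` rather than `d_out * d_in`.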
+
+ ---
+
+ ## 📊 Model Specifications
+
+ - **Model Type**: LoRA (Low-Rank Adaptation)
+ - **Base Model**: Qwen/Qwen-Image-Edit
+ - **File Format**: SafeTensors (.safetensors), 47,249,496 bytes (about 47 MB)
+ - **Specialization**: In-scene image editing
+ - **Training Framework**: Diffusers + Accelerate
+ - **Memory Efficient**: Optimized for consumer GPUs
+
+ ---
+
+ ## 🤝 Support
+
+ If you have questions or suggestions, join our community:
+
+ - 🌐 [FlyMy.AI](https://flymy.ai)
+ - 💬 [Discord Community](https://discord.com/invite/t6hPBpSebw)
+ - 🐦 [Follow us on X](https://x.com/flymyai)
+ - 💼 [Connect on LinkedIn](https://linkedin.com/company/flymyai)
+ - 📧 [Support](mailto:[email protected])
+
+ **⭐ Don't forget to star the repository if you like it!**
+
+ ---
+
+ ## 📄 License
+
+ This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
assets/qie2_input.jpg ADDED
assets/qie2_lora.jpg ADDED
assets/qie2_orig.jpg ADDED
flymy_qwen_image_edit_inscene_lora.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd902c211307b92b71b957855634ddb871baa64aad9bb58f8819122ca3935a8f
+ size 47249496
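
The Git LFS pointer above records the SHA-256 digest and byte size of the real weights file, so a downloaded copy can be checked against it. The following is a minimal standard-library sketch; the function names are hypothetical, and the local path in the usage section assumes the file sits in the current directory.

```python
import hashlib
import os

# Pointer contents as shown in the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:dd902c211307b92b71b957855634ddb871baa64aad9bb58f8819122ca3935a8f
size 47249496
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if os.path.getsize(path) != int(fields["size"]):
        return False
    digest = hashlib.new(algo)
    with open(path, "rb") as fh:
        # Hash in chunks so the whole file never has to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the weights were downloaded.
    path = "./flymy_qwen_image_edit_inscene_lora.safetensors"
    if os.path.exists(path):
        print("matches pointer:", matches_pointer(path, POINTER))
    else:
        print("weights file not found at", path)
```

A mismatch usually means an incomplete download or that the pointer file itself was fetched instead of the LFS object.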