---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
library_name: onnx
tags:
- stable-diffusion
- text-to-image
- diffusion
- webgpu
- browser-ai
- onnx
- zhare-ai
- client-side
- privacy-preserving
pipeline_tag: text-to-image
inference: false
widget:
- text: "A beautiful sunset over mountains, digital art style"
  example_title: "Mountain Sunset"
- text: "A futuristic cityscape with flying cars at night, cyberpunk"
  example_title: "Cyberpunk City"
- text: "A serene lake surrounded by autumn trees, oil painting"
  example_title: "Autumn Lake"
- text: "Portrait of a wise elderly person, studio lighting, photorealistic"
  example_title: "Portrait"
model-index:
- name: sd-1-5-webgpu
  results:
  - task:
      type: text-to-image
      name: Text-to-Image Generation
    dataset:
      name: Browser Performance Benchmark
      type: webgpu-inference
    metrics:
    - type: generation-time
      value: 3-45
      name: Generation Time (seconds)
      config: 512x512, 20 steps, various hardware
    - type: memory-usage
      value: 4-6
      name: VRAM Usage (GB)
      config: WebGPU acceleration
    - type: model-size
      value: 4.8
      name: Total Model Size (GB)
      config: All ONNX components
---

<div align="center">
  <img src="zhare-logo.png" alt="Zhare-AI Logo" width="200" height="auto" style="margin-bottom: 20px;">
</div>

# Stable Diffusion 1.5 WebGPU by Zhare-AI

<div align="center">
  
![License](https://img.shields.io/badge/License-CreativeML_OpenRAIL--M-blue.svg)
![WebGPU](https://img.shields.io/badge/WebGPU-Ready-green)
![Privacy](https://img.shields.io/badge/Privacy-First-purple)
![Production](https://img.shields.io/badge/Production-Ready-brightgreen)

**Privacy-preserving text-to-image generation in your browser with WebGPU acceleration**

</div>

This is a browser-optimized implementation of Stable Diffusion v1.5, specifically converted and optimized for client-side deployment using WebGPU acceleration. Developed by **Zhare-AI**, this model enables high-quality image generation directly in web browsers without requiring server infrastructure, ensuring complete user privacy and data sovereignty.

<div align="center">
  <img src="zhare-logo.png" alt="Zhare-AI - Democratizing AI" width="150" height="auto">
  <p><em>Democratizing AI through distributed computing and privacy-preserving technology</em></p>
</div>

## 🌟 Key Features

- 🌐 **Fully Client-Side**: Complete image generation in the browser, no data leaves your device
- ⚡ **WebGPU Accelerated**: Hardware-accelerated inference with automatic WebAssembly fallback
- 🔒 **Privacy-First**: All processing happens locally, protecting user prompts and generated content
- 📱 **Cross-Platform**: Compatible with desktop and mobile browsers
- 🛠️ **Production-Ready**: Optimized for real-world web applications

## 🚀 Quick Start

### Installation & Setup

```bash
# Clone or download the model
git lfs install
git clone https://huggingface.co/Zhare-AI/sd-1-5-webgpu
```

## 📊 Performance Specifications

### Model Architecture

| Component | Description | Approximate Size |
|-----------|-------------|------------------|
| **Text Encoder** | CLIP ViT-L/14 for text understanding | ~500MB |
| **UNet** | Core diffusion model for image generation | ~3.4GB |
| **VAE Decoder** | Converts latents to final images | ~160MB |
| **VAE Encoder** | Encodes images to latent space | ~160MB |
| **Safety Checker** | Content filtering (optional) | ~600MB |

**Total Model Size**: ~4.8GB (without safety checker: ~4.2GB)
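
As a quick arithmetic check on the totals above, the following sketch sums the approximate component sizes from the table. The names `componentSizesMB` and `totalSizeGB` are illustrative only and are not part of any shipped API; 1 GB is taken as 1000 MB.

```typescript
// Approximate component sizes in MB, copied from the table above.
const componentSizesMB: Record<string, number> = {
  textEncoder: 500,
  unet: 3400,
  vaeDecoder: 160,
  vaeEncoder: 160,
  safetyChecker: 600,
};

// Sum the sizes, skipping any named components, and return the total in GB.
function totalSizeGB(sizes: Record<string, number>, skip: string[] = []): number {
  let totalMB = 0;
  for (const name of Object.keys(sizes)) {
    if (skip.indexOf(name) === -1) {
      totalMB += sizes[name];
    }
  }
  return totalMB / 1000;
}

console.log(totalSizeGB(componentSizesMB).toFixed(2)); // "4.82"
console.log(totalSizeGB(componentSizesMB, ["safetyChecker"]).toFixed(2)); // "4.22"
```

Both results round to the ~4.8GB and ~4.2GB figures quoted above.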

### Browser Performance Benchmarks

*Generation time for 512×512 images with 20 inference steps:*

| Hardware Category | Example Device | Typical Performance |
|------------------|----------------|-------------------|
| **High-End Desktop** | RTX 4090, RTX 4080 | 3-8 seconds |
| **Gaming Desktop** | RTX 3080, RTX 3070 | 8-15 seconds |
| **Intel Arc GPUs** | Arc A750, Arc A770 | 8-15 seconds |
| **AMD High-End** | RX 7900 XT/XTX | 6-12 seconds |
| **Apple Silicon** | M2 Max, M1 Ultra | 10-20 seconds |
| **Integrated GPUs** | Intel Iris Xe | 25-50 seconds |
| **WebAssembly Fallback** | CPU-only devices | 2-10 minutes |
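
The table is quoted for 20 inference steps. For a rough idea of other step counts, the sketch below rescales a benchmark range, assuming that generation time grows roughly linearly with the number of steps. That assumption ignores fixed costs such as text encoding and VAE decoding, so treat the result as an estimate rather than a measurement. `TimeRange` and `estimateRange` are hypothetical names.

```typescript
interface TimeRange {
  minSec: number;
  maxSec: number;
}

// Step count used for the benchmark table above.
const BENCHMARK_STEPS = 20;

// Two rows from the table above, in seconds.
const highEndDesktop: TimeRange = { minSec: 3, maxSec: 8 };
const integratedGpu: TimeRange = { minSec: 25, maxSec: 50 };

// Rescale a benchmark range to another step count (linear-scaling assumption).
function estimateRange(range: TimeRange, steps: number): TimeRange {
  if (steps <= 0) {
    throw new Error("steps must be positive");
  }
  const scale = steps / BENCHMARK_STEPS;
  return { minSec: range.minSec * scale, maxSec: range.maxSec * scale };
}

console.log(estimateRange(highEndDesktop, 30)); // { minSec: 4.5, maxSec: 12 }
console.log(estimateRange(integratedGpu, 10)); // { minSec: 12.5, maxSec: 25 }
```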

### System Requirements

- **Minimum VRAM**: 4GB (recommended: 6GB+)
- **System RAM**: 8GB minimum, 16GB recommended
- **Storage**: 5GB free space for model files
- **Browser**: Chrome 113+, Edge 113+ (WebGPU), or any modern browser (WebAssembly fallback)

## 🌐 Browser Compatibility

| Browser | WebGPU Support | Performance Level | Notes |
|---------|---------------|------------------|-------|
| **Chrome 113+** | ✅ Full Support | Excellent | Primary recommendation |
| **Microsoft Edge 113+** | ✅ Full Support | Excellent | Primary recommendation |
| **Firefox 141+** | ✅ Stable Support | Very Good | Recent WebGPU implementation |
| **Safari 17.4+** | 🔶 Experimental | Good | Behind feature flag |
| **Mobile Chrome 121+** | 🔶 Limited | Fair | Android only, limited memory |

*All browsers support WebAssembly fallback for universal compatibility*
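
A minimal sketch of how an application might choose between WebGPU and the WebAssembly fallback is shown below. It relies on the standard WebGPU entry point `navigator.gpu.requestAdapter()`; the `NavigatorLike` interface and the `detectBackend` function are hypothetical helpers written for this example, and the loading code shipped with this repository may work differently.

```typescript
type Backend = "webgpu" | "wasm";

// Minimal structural types so the sketch compiles without WebGPU type packages.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Prefer WebGPU when an adapter is available; otherwise fall back to WebAssembly.
async function detectBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) {
    return "wasm"; // the browser does not expose the WebGPU API
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // a null adapter means no usable GPU
  } catch {
    return "wasm"; // treat any failure while requesting an adapter as "no WebGPU"
  }
}
```

In a browser, the page's `navigator` object would be passed in. Taking it as a parameter keeps the function easy to exercise with stub objects outside the browser.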

## 📝 Model Details

### Training Information

This model is based on Stable Diffusion v1.5 with the following training characteristics:

- **Base Dataset**: LAION-5B filtered subset (~590M image-text pairs)
- **Training Resolution**: 512×512 pixels
- **Architecture**: Latent Diffusion Model with CLIP ViT-L/14 text encoder
- **Precision**: Originally trained in FP32, optimized to FP16 for browser deployment
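
To illustrate why the FP32 to FP16 conversion matters for download size and memory, the sketch below computes weight storage from a parameter count. FP32 uses 4 bytes per parameter and FP16 uses 2. The parameter count is a round, hypothetical number chosen for the arithmetic, not a measurement of this model, and `weightSizeGB` is an illustrative helper.

```typescript
// Bytes needed to store one parameter at each precision.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2 } as const;
type Precision = keyof typeof BYTES_PER_PARAM;

// Weight storage in GB (1 GB = 1e9 bytes) for a given parameter count.
function weightSizeGB(paramCount: number, precision: Precision): number {
  return (paramCount * BYTES_PER_PARAM[precision]) / 1e9;
}

// Hypothetical parameter count, chosen only to keep the arithmetic simple.
const exampleParams = 1e9;

console.log(weightSizeGB(exampleParams, "fp32")); // 4
console.log(weightSizeGB(exampleParams, "fp16")); // 2
```

Halving the bytes per parameter halves the weight storage, whatever the actual parameter count is.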

### Optimization for Web Deployment

- **ONNX Conversion**: Optimized computational graph for web inference
- **WebGPU Kernels**: Custom compute shaders for GPU acceleration
- **Memory Efficiency**: Attention slicing and dynamic memory management
- **Cross-Platform**: WebAssembly fallback ensures universal browser support

## 🛡️ Ethical Use and Safety

### Built-in Safety Features

- **Content Filter**: Optional NSFW detection and filtering
- **Prompt Sanitization**: Basic filtering of potentially harmful prompts
- **Local Processing**: No data transmission ensures privacy protection

### Responsible Use Guidelines

✅ **Encouraged Uses:**
- Creative art and design projects
- Educational demonstrations of AI capabilities
- Rapid prototyping for applications
- Personal creative exploration
- Research and development

❌ **Prohibited Uses:**
- Creating harmful, offensive, or illegal content
- Generating misleading information or deepfakes
- Violating copyright or intellectual property rights
- Any use that violates the CreativeML OpenRAIL-M license terms

### Privacy and Data Protection

- **Zero Data Collection**: All processing occurs locally in your browser
- **No Server Communication**: Model runs entirely offline after initial download
- **User Control**: Complete control over generated content and prompts
- **GDPR Compliant**: No personal data processing or storage

## ⚠️ Limitations and Considerations

### Technical Limitations

- **Resolution**: Optimized for 512×512 (other resolutions may reduce quality)
- **Batch Size**: Single image generation only in browser environment
- **Memory Constraints**: Limited by browser and device VRAM/RAM
- **Generation Speed**: Slower than dedicated server hardware
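
The resolution and batch-size limits become more concrete when expressed as the shape of the latent tensor the UNet works on. In Stable Diffusion v1.x the VAE reduces each spatial side by a factor of 8 and uses 4 latent channels. The sketch below computes that shape; the `[batch, channels, height, width]` layout is an assumption about the ONNX export, and `latentShape` is a hypothetical helper.

```typescript
const VAE_DOWNSAMPLE = 8; // each spatial side shrinks by 8 in the v1.x VAE
const LATENT_CHANNELS = 4; // latent channels in Stable Diffusion v1.x

// Latent tensor shape for a given image size (assumed layout: N, C, H, W).
function latentShape(
  width: number,
  height: number,
  batch: number = 1
): [number, number, number, number] {
  if (width % VAE_DOWNSAMPLE !== 0 || height % VAE_DOWNSAMPLE !== 0) {
    throw new Error("width and height must be multiples of 8");
  }
  return [batch, LATENT_CHANNELS, height / VAE_DOWNSAMPLE, width / VAE_DOWNSAMPLE];
}

const shape = latentShape(512, 512);
const elementCount = shape[0] * shape[1] * shape[2] * shape[3];

console.log(shape.join("x")); // "1x4x64x64"
console.log(elementCount); // 16384
```

The element count grows with the square of the side length (a 768×768 image has 2.25 times as many latent values as 512×512), which is one reason larger resolutions put more pressure on browser memory limits.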

### Content Limitations

- **Language Bias**: Best performance with English prompts
- **Cultural Representation**: Training data may reflect Western/English-speaking biases
- **Artistic Style**: Tendency toward photorealistic and digital art styles
- **Consistency**: Multiple generations from same prompt may vary significantly

### Browser-Specific Considerations

- **WebGPU Availability**: Limited to supporting browsers and devices
- **Memory Management**: Browser security limits may affect large model loading
- **Performance Variance**: Significant variation across different devices and browsers

## 📜 License: CreativeML OpenRAIL-M

This model is released under the **CreativeML OpenRAIL-M** license, which allows for:

✅ **Permitted:**
- Commercial and non-commercial use
- Distribution and modification
- Creation of derivative works
- Integration into applications and services

🚫 **Restrictions:**
- Must not be used to generate harmful content
- Cannot be used for illegal activities
- Must include license terms in any distribution
- Derivative works must maintain the same license restrictions

**Full License Text**: Available at [CreativeML OpenRAIL-M License](https://huggingface.co/spaces/CompVis/stable-diffusion-license)

### License Compliance

When using this model:
1. **Include License**: Provide license terms to end users
2. **Respect Restrictions**: Ensure use cases comply with content restrictions
3. **Derivative Works**: Apply same license to modified versions
4. **Attribution**: Credit original Stable Diffusion creators and Zhare-AI adaptation

## 🏢 About Zhare-AI

<div align="center">
  <img src="zhare-logo.png" alt="Zhare-AI" width="120" height="auto" style="margin: 20px 0;">
</div>

**Zhare-AI** is focused on democratizing AI technology by making powerful models accessible directly in web browsers. Our mission is to enable privacy-preserving AI applications that put users in control of their data and creative processes.

- **Website**: [zhare.ai](https://zhare.ai)
- **Focus**: Distributed AI computing and browser-based AI applications
- **Philosophy**: Privacy-first, user-controlled AI experiences
- **Vision**: Making AI accessible, private, and distributed

### Our Mission

We believe AI should be:
- **Accessible** to everyone, regardless of infrastructure
- **Private** with complete user data control
- **Distributed** across devices rather than centralized servers
- **Transparent** with open-source implementations

## 📚 Citation and References

### Cite This Work

```bibtex
@misc{zhare-ai-sd15-webgpu-2025,
  title={Stable Diffusion 1.5 WebGPU: Browser-Optimized Text-to-Image Generation},
  author={Zhare-AI},
  year={2025},
  howpublished={\url{https://huggingface.co/Zhare-AI/sd-1-5-webgpu}},
  note={WebGPU-optimized implementation for privacy-preserving browser-based image generation}
}
```

### Original Stable Diffusion Citation

```bibtex
@InProceedings{Rombach_2022_CVPR,
  author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Björn},
  title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2022},
  pages     = {10684-10695}
}
```

## 🤝 Community and Support

### Getting Help

- **Issues**: Report technical problems via the repository issues
- **Discussions**: Join the community discussion for tips and examples
- **Documentation**: Comprehensive guides available in the repository

### Contributing

We welcome contributions to improve browser compatibility, performance, and user experience:

- Performance optimizations for different hardware
- Browser compatibility improvements  
- Documentation enhancements
- Example applications and tutorials

---

<div align="center">
  <img src="zhare-logo.png" alt="Zhare-AI" width="100" height="auto">
  
**🚀 Ready to create amazing images directly in your browser?**

*This model brings the power of Stable Diffusion to web applications while keeping your data completely private and secure.*

**Developed with ❤️ by Zhare-AI for the open-source community**

[🌐 Visit Zhare.ai](https://zhare.ai) | [📧 Contact Us](mailto:[email protected]) | [💬 Join Discussion](https://github.com/Zhare-AI)

</div>