---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
library_name: onnx
tags:
- stable-diffusion
- text-to-image
- diffusion
- webgpu
- browser-ai
- onnx
- zhare-ai
- client-side
- privacy-preserving
pipeline_tag: text-to-image
inference: false
widget:
- text: "A beautiful sunset over mountains, digital art style"
  example_title: "Mountain Sunset"
- text: "A futuristic cityscape with flying cars at night, cyberpunk"
  example_title: "Cyberpunk City"
- text: "A serene lake surrounded by autumn trees, oil painting"
  example_title: "Autumn Lake"
- text: "Portrait of a wise elderly person, studio lighting, photorealistic"
  example_title: "Portrait"
model-index:
- name: sd-1-5-webgpu
  results:
  - task:
      type: text-to-image
      name: Text-to-Image Generation
    dataset:
      name: Browser Performance Benchmark
      type: webgpu-inference
    metrics:
    - type: generation-time
      value: 3-45
      name: Generation Time (seconds)
      config: 512x512, 20 steps, various hardware
    - type: memory-usage
      value: 4-6
      name: VRAM Usage (GB)
      config: WebGPU acceleration
    - type: model-size
      value: 3.5
      name: Total Model Size (GB)
      config: All ONNX components
---
<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI Logo" width="200" height="auto" style="margin-bottom: 20px;">
</div>
# Stable Diffusion 1.5 WebGPU by Zhare-AI
<div align="center">




**Privacy-preserving text-to-image generation in your browser with WebGPU acceleration**
</div>
This is Stable Diffusion v1.5 converted to ONNX and optimized for client-side inference with WebGPU acceleration. Developed by **Zhare-AI**, the model generates images directly in the web browser without server infrastructure, so prompts and generated images never leave the user's device.
<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI - Democratizing AI" width="150" height="auto">
<p><em>Democratizing AI through distributed computing and privacy-preserving technology</em></p>
</div>
## 🌟 Key Features
- 🌐 **Fully Client-Side**: Complete image generation in the browser, no data leaves your device
- ⚡ **WebGPU Accelerated**: Hardware-accelerated inference with automatic WebAssembly fallback
- 🔒 **Privacy-First**: All processing happens locally, protecting user prompts and generated content
- 📱 **Cross-Platform**: Runs in desktop browsers and, with tighter memory limits, in Chrome on Android
- 🛠️ **Production-Ready**: Optimized for real-world web applications
## 🚀 Quick Start
### Installation & Setup
```bash
# Clone or download the model
git lfs install
git clone https://huggingface.co/Zhare-AI/sd-1-5-webgpu
```
## 📊 Performance Specifications
### Model Architecture
| Component | Description | Approximate Size |
|-----------|-------------|------------------|
| **Text Encoder** | CLIP ViT-L/14 for text understanding | ~500MB |
| **UNet** | Core diffusion model for image generation | ~3.4GB |
| **VAE Decoder** | Converts latents to final images | ~160MB |
| **VAE Encoder** | Encodes images to latent space | ~160MB |
| **Safety Checker** | Content filtering (optional) | ~600MB |
**Total Model Size**: ~4.8GB (without safety checker: ~4.2GB)
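The totals above follow from adding up the table rows. The sketch below reproduces that arithmetic; the component keys are illustrative names rather than file names from the repository, and it assumes 1GB = 1000MB. It also shows a text-to-image-only subset, on the assumption that the VAE encoder is only needed when an input image has to be encoded.

```typescript
// Approximate component sizes in MB, copied from the table above.
// The keys are illustrative labels, not repository file names.
const componentSizesMB: Record<string, number> = {
  textEncoder: 500,
  unet: 3400,
  vaeDecoder: 160,
  vaeEncoder: 160,
  safetyChecker: 600,
};

// Sum the selected components and convert MB to GB (1GB = 1000MB here).
function totalSizeGB(components: string[]): number {
  const totalMB = components.reduce((sum, name) => sum + componentSizesMB[name], 0);
  return totalMB / 1000;
}

const allComponents = Object.keys(componentSizesMB);
const withoutSafetyChecker = allComponents.filter((name) => name !== "safetyChecker");
// Assumption: plain text-to-image does not need the VAE encoder.
const textToImageOnly = withoutSafetyChecker.filter((name) => name !== "vaeEncoder");

console.log(totalSizeGB(allComponents).toFixed(2));        // 4.82
console.log(totalSizeGB(withoutSafetyChecker).toFixed(2)); // 4.22
console.log(totalSizeGB(textToImageOnly).toFixed(2));      // 4.06
```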
### Browser Performance Benchmarks
*Generation time for 512×512 images with 20 inference steps:*
| Hardware Category | Example Device | Typical Performance |
|------------------|----------------|-------------------|
| **High-End Desktop** | RTX 4090, RTX 4080 | 3-8 seconds |
| **Gaming Desktop** | RTX 3080, RTX 3070 | 8-15 seconds |
| **Intel Arc GPUs** | Arc A750, Arc A770 | 8-15 seconds |
| **AMD High-End** | RX 7900 XT/XTX | 6-12 seconds |
| **Apple Silicon** | M2 Max, M1 Ultra | 10-20 seconds |
| **Integrated GPUs** | Intel Iris Xe | 25-50 seconds |
| **WebAssembly Fallback** | CPU-only devices | 2-10 minutes |
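The figures above are for 20 inference steps. As a rough way to reason about other step counts, the sketch below scales a reported range linearly with the number of steps. Linear scaling is an assumption: it ignores fixed costs such as text encoding and VAE decoding, so treat the result as an estimate only.

```typescript
// A reported time range in seconds for one 512x512 image.
interface TimeRange {
  minSeconds: number;
  maxSeconds: number;
}

// The step count the benchmark table was measured with.
const REFERENCE_STEPS = 20;

// Scale a reported range to another step count, assuming time grows
// linearly with the number of denoising steps.
function estimateRange(reported: TimeRange, steps: number): TimeRange {
  const factor = steps / REFERENCE_STEPS;
  return {
    minSeconds: reported.minSeconds * factor,
    maxSeconds: reported.maxSeconds * factor,
  };
}

// "Gaming Desktop" row from the table above: 8-15 seconds at 20 steps.
const gamingDesktop: TimeRange = { minSeconds: 8, maxSeconds: 15 };
console.log(estimateRange(gamingDesktop, 30)); // { minSeconds: 12, maxSeconds: 22.5 }
```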
### System Requirements
- **Minimum VRAM**: 4GB (recommended: 6GB+)
- **System RAM**: 8GB minimum, 16GB recommended
- **Storage**: 5GB free space for model files
- **Browser**: Chrome 113+, Edge 113+ (WebGPU), or any modern browser (WebAssembly fallback)
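An application could check these requirements before starting the download. The sketch below is one way to do that with values the caller supplies; `DeviceInfo` and `checkRequirements` are hypothetical names, and VRAM is left out because WebGPU does not report total GPU memory directly.

```typescript
// Values the host application is assumed to gather itself
// (for example from feature detection and a storage estimate).
interface DeviceInfo {
  hasWebGPU: boolean;
  systemRamGB: number;
  freeStorageGB: number;
}

// Compare a device against the thresholds listed above and
// return human-readable notes; an empty list means no concerns.
function checkRequirements(device: DeviceInfo): string[] {
  const notes: string[] = [];
  if (!device.hasWebGPU) {
    notes.push("No WebGPU: expect the slower WebAssembly fallback.");
  }
  if (device.systemRamGB < 8) {
    notes.push("Less than 8GB system RAM: below the stated minimum.");
  } else if (device.systemRamGB < 16) {
    notes.push("Less than 16GB system RAM: below the recommended amount.");
  }
  if (device.freeStorageGB < 5) {
    notes.push("Less than 5GB free storage: model files may not fit.");
  }
  return notes;
}

console.log(checkRequirements({ hasWebGPU: true, systemRamGB: 8, freeStorageGB: 20 }));
// [ 'Less than 16GB system RAM: below the recommended amount.' ]
```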
## 🌐 Browser Compatibility
| Browser | WebGPU Support | Performance Level | Notes |
|---------|---------------|------------------|-------|
| **Chrome 113+** | ✅ Full Support | Excellent | Primary recommendation |
| **Microsoft Edge 113+** | ✅ Full Support | Excellent | Primary recommendation |
| **Firefox 141+** | ✅ Stable Support | Very Good | Recent WebGPU implementation; Windows only as of Firefox 141 |
| **Safari 17.4+** | 🔶 Experimental | Good | Behind feature flag |
| **Mobile Chrome 121+** | 🔶 Limited | Fair | Android only, limited memory |
*Browsers without WebGPU fall back to WebAssembly (CPU) inference, so the model still runs, only more slowly.*
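Because support varies this much, applications usually detect WebGPU at runtime rather than checking browser versions. The sketch below shows one way to pick a backend. It relies on the standard `navigator.gpu.requestAdapter()` call, but takes a navigator-like object as a parameter so it can run outside a browser; `detectBackend` and the two interfaces are hypothetical helpers, not part of this repository.

```typescript
type Backend = "webgpu" | "wasm";

// Minimal shape of the parts of `navigator` this sketch needs.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Prefer WebGPU when an adapter is actually available, else use WebAssembly.
async function detectBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) return "wasm"; // API not exposed by this browser
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no suitable GPU
  } catch {
    return "wasm"; // treat any failure as "no WebGPU"
  }
}

// In a browser this would be called as detectBackend(navigator).
detectBackend({}).then((backend) => console.log(backend)); // wasm
```

Checking for an adapter, and not only for `navigator.gpu`, matters because a browser can expose the API on a device that has no usable GPU.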
## 📝 Model Details
### Training Information
This model is based on Stable Diffusion v1.5 with the following training characteristics:
- **Base Dataset**: LAION-5B filtered subset (~590M image-text pairs)
- **Training Resolution**: 512×512 pixels
- **Architecture**: Latent Diffusion Model with CLIP ViT-L/14 text encoder
- **Precision**: Originally trained in FP32, optimized to FP16 for browser deployment
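To make the latent-diffusion and precision points concrete, the sketch below computes the latent tensor shape for a given image size and its memory footprint at each precision. It assumes the standard Stable Diffusion v1.x layout of 4 latent channels and 8x spatial downsampling in the VAE; `latentShape` and `tensorBytes` are hypothetical helper names.

```typescript
// Assumed Stable Diffusion v1.x constants.
const VAE_SCALE_FACTOR = 8; // each latent pixel covers an 8x8 image patch
const LATENT_CHANNELS = 4;

type Shape = [number, number, number, number];

// Shape of the latent tensor (batch, channels, height, width) for an image size.
function latentShape(width: number, height: number, batch = 1): Shape {
  if (width % VAE_SCALE_FACTOR !== 0 || height % VAE_SCALE_FACTOR !== 0) {
    throw new Error("width and height must be multiples of 8");
  }
  return [batch, LATENT_CHANNELS, height / VAE_SCALE_FACTOR, width / VAE_SCALE_FACTOR];
}

// Bytes needed to store a tensor of the given shape.
function tensorBytes(shape: Shape, bytesPerElement: number): number {
  return shape.reduce((product, dim) => product * dim, 1) * bytesPerElement;
}

const shape = latentShape(512, 512);
console.log(shape);                 // [ 1, 4, 64, 64 ]
console.log(tensorBytes(shape, 4)); // 65536 bytes in FP32
console.log(tensorBytes(shape, 2)); // 32768 bytes in FP16
```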
### Optimization for Web Deployment
- **ONNX Conversion**: Optimized computational graph for web inference
- **WebGPU Kernels**: Custom compute shaders for GPU acceleration
- **Memory Efficiency**: Attention slicing and dynamic memory management
- **Cross-Platform**: WebAssembly fallback ensures universal browser support
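Attention slicing can be illustrated on plain arrays. The sketch below computes scaled dot-product attention for a few query rows at a time, so only a small block of the score matrix exists at once, and checks that the result matches the unsliced computation. It slices over query rows for simplicity; real implementations may slice over heads or batches instead, and this is not the model's actual WebGPU kernel.

```typescript
type Matrix = number[][];

// Numerically stable softmax over one row of scores.
function softmax(row: number[]): number[] {
  const max = Math.max(...row);
  const exps = row.map((x) => Math.exp(x - max));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map((e) => e / sum);
}

// Scaled dot-product attention for a block of query rows.
function attend(q: Matrix, k: Matrix, v: Matrix): Matrix {
  const scale = 1 / Math.sqrt(k[0].length);
  // Score matrix for this block only: (rows in q) x (rows in k).
  const scores = q.map((qRow) =>
    k.map((kRow) => scale * qRow.reduce((sum, x, i) => sum + x * kRow[i], 0)),
  );
  const weights = scores.map((row) => softmax(row));
  return weights.map((wRow) =>
    v[0].map((_, col) => wRow.reduce((sum, w, j) => sum + w * v[j][col], 0)),
  );
}

// Process the queries in slices so the peak score matrix is
// sliceSize x (number of keys) instead of (number of queries) x (number of keys).
function slicedAttention(q: Matrix, k: Matrix, v: Matrix, sliceSize: number): Matrix {
  if (sliceSize < 1) throw new Error("sliceSize must be at least 1");
  const out: Matrix = [];
  for (let start = 0; start < q.length; start += sliceSize) {
    out.push(...attend(q.slice(start, start + sliceSize), k, v));
  }
  return out;
}

const q: Matrix = [[1, 0], [0, 1], [1, 1], [0.5, -0.5]];
const k: Matrix = [[1, 2], [3, 4], [5, 6]];
const v: Matrix = [[1, 0], [0, 1], [1, 1]];
const full = attend(q, k, v);
const sliced = slicedAttention(q, k, v, 2);
console.log(JSON.stringify(full) === JSON.stringify(sliced)); // true
```

Each query row is independent of the others, which is why slicing changes peak memory but not the result.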
## 🛡️ Ethical Use and Safety
### Built-in Safety Features
- **Content Filter**: Optional NSFW detection and filtering
- **Prompt Sanitization**: Basic filtering of potentially harmful prompts
- **Local Processing**: No data transmission ensures privacy protection
### Responsible Use Guidelines
✅ **Encouraged Uses:**
- Creative art and design projects
- Educational demonstrations of AI capabilities
- Rapid prototyping for applications
- Personal creative exploration
- Research and development
❌ **Prohibited Uses:**
- Creating harmful, offensive, or illegal content
- Generating misleading information or deepfakes
- Violating copyright or intellectual property rights
- Any use that violates the CreativeML OpenRAIL-M license terms
### Privacy and Data Protection
- **Zero Data Collection**: All processing occurs locally in your browser
- **No Server Communication**: Model runs entirely offline after initial download
- **User Control**: Complete control over generated content and prompts
- **GDPR Compliant**: No personal data processing or storage
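Running offline after the initial download implies that model files are cached on the device. The sketch below shows a cache-first loading pattern under that assumption. `ByteCache`, `Downloader`, `loadModelFile`, and the file path are all hypothetical; in a browser the cache could be backed by the Cache API or IndexedDB, which is not something this model card specifies.

```typescript
// Storage abstraction; a browser implementation is assumed, not provided here.
interface ByteCache {
  get(url: string): Promise<Uint8Array | undefined>;
  put(url: string, bytes: Uint8Array): Promise<void>;
}

// Function that fetches a file over the network.
type Downloader = (url: string) => Promise<Uint8Array>;

// Return the cached copy when present; otherwise download once and store it.
async function loadModelFile(
  url: string,
  cache: ByteCache,
  download: Downloader,
): Promise<Uint8Array> {
  const cached = await cache.get(url);
  if (cached) return cached;
  const bytes = await download(url);
  await cache.put(url, bytes);
  return bytes;
}

// In-memory cache, used only to demonstrate the pattern.
class MemoryCache implements ByteCache {
  private store = new Map<string, Uint8Array>();
  async get(url: string): Promise<Uint8Array | undefined> {
    return this.store.get(url);
  }
  async put(url: string, bytes: Uint8Array): Promise<void> {
    this.store.set(url, bytes);
  }
}

let downloadCount = 0;
const fakeDownload: Downloader = async () => {
  downloadCount++;
  return new Uint8Array([1, 2, 3]);
};

async function demo(): Promise<number> {
  const cache = new MemoryCache();
  await loadModelFile("unet/model.onnx", cache, fakeDownload); // hypothetical path
  await loadModelFile("unet/model.onnx", cache, fakeDownload); // served from cache
  return downloadCount;
}

demo().then((count) => console.log(count)); // 1
```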
## ⚠️ Limitations and Considerations
### Technical Limitations
- **Resolution**: Optimized for 512×512 (other resolutions may reduce quality)
- **Batch Size**: Single image generation only in browser environment
- **Memory Constraints**: Limited by browser and device VRAM/RAM
- **Generation Speed**: Slower than dedicated server hardware
### Content Limitations
- **Language Bias**: Best performance with English prompts
- **Cultural Representation**: Training data may reflect Western/English-speaking biases
- **Artistic Style**: Tendency toward photorealistic and digital art styles
- **Consistency**: Multiple generations from same prompt may vary significantly
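Much of the variation between generations comes from the random noise that diffusion starts from, so fixing the seed of that noise is the usual way to make a result reproducible. The sketch below generates deterministic Gaussian noise from a seed using a small mulberry32-style generator and the Box-Muller transform. It is an illustration only; whether and how this model exposes a seed is not stated here, and the latent size is an assumption.

```typescript
// Small seeded PRNG (mulberry32-style); returns uniform floats in [0, 1).
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fill a latent tensor with standard-normal noise via the Box-Muller transform.
function initialLatents(seed: number, elementCount: number): Float32Array {
  const random = seededRandom(seed);
  const latents = new Float32Array(elementCount);
  for (let i = 0; i < elementCount; i++) {
    const u1 = 1 - random(); // in (0, 1], so Math.log never sees 0
    const u2 = random();
    latents[i] = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  }
  return latents;
}

// Assumed latent size for one 512x512 image: 1 x 4 x 64 x 64.
const elementCount = 1 * 4 * 64 * 64;
const first = initialLatents(42, elementCount);
const repeat = initialLatents(42, elementCount);
const other = initialLatents(43, elementCount);

console.log(first.every((value, i) => value === repeat[i])); // true: same seed, same noise
console.log(first.every((value, i) => value === other[i]));  // false: different seed
```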
### Browser-Specific Considerations
- **WebGPU Availability**: Limited to supporting browsers and devices
- **Memory Management**: Browser security limits may affect large model loading
- **Performance Variance**: Significant variation across different devices and browsers
## 📜 License: CreativeML OpenRAIL-M
This model is released under the **CreativeML OpenRAIL-M** license, which grants the permissions below subject to the use-based restrictions that follow:
✅ **Permitted:**
- Commercial and non-commercial use
- Distribution and modification
- Creation of derivative works
- Integration into applications and services
🚫 **Restrictions:**
- Must not be used to generate harmful content
- Cannot be used for illegal activities
- Must include license terms in any distribution
- Derivative works must maintain the same license restrictions
**Full License Text**: Available at [CreativeML OpenRAIL-M License](https://huggingface.co/spaces/CompVis/stable-diffusion-license)
### License Compliance
When using this model:
1. **Include License**: Provide license terms to end users
2. **Respect Restrictions**: Ensure use cases comply with content restrictions
3. **Derivative Works**: Apply same license to modified versions
4. **Attribution**: Credit the original Stable Diffusion authors and the Zhare-AI adaptation
## 🏢 About Zhare-AI
<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI" width="120" height="auto" style="margin: 20px 0;">
</div>
**Zhare-AI** is focused on democratizing AI technology by making powerful models accessible directly in web browsers. Our mission is to enable privacy-preserving AI applications that put users in control of their data and creative processes.
- **Website**: [zhare.ai](https://zhare.ai)
- **Focus**: Distributed AI computing and browser-based AI applications
- **Philosophy**: Privacy-first, user-controlled AI experiences
- **Vision**: Making AI accessible, private, and distributed
### Our Mission
We believe AI should be:
- **Accessible** to everyone, regardless of infrastructure
- **Private** with complete user data control
- **Distributed** across devices rather than centralized servers
- **Transparent** with open-source implementations
## 📚 Citation and References
### Cite This Work
```bibtex
@misc{zhare-ai-sd15-webgpu-2025,
title={Stable Diffusion 1.5 WebGPU: Browser-Optimized Text-to-Image Generation},
author={Zhare-AI},
year={2025},
howpublished={\url{https://huggingface.co/Zhare-AI/sd-1-5-webgpu}},
note={WebGPU-optimized implementation for privacy-preserving browser-based image generation}
}
```
### Original Stable Diffusion Citation
```bibtex
@InProceedings{Rombach_2022_CVPR,
author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Björn},
title = {High-Resolution Image Synthesis With Latent Diffusion Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {10684-10695}
}
```
## 🤝 Community and Support
### Getting Help
- **Issues**: Report technical problems via the repository issues
- **Discussions**: Join the community discussion for tips and examples
- **Documentation**: Comprehensive guides available in the repository
### Contributing
We welcome contributions to improve browser compatibility, performance, and user experience:
- Performance optimizations for different hardware
- Browser compatibility improvements
- Documentation enhancements
- Example applications and tutorials
---
<div align="center">
<img src="zhare-logo.png" alt="Zhare-AI" width="100" height="auto">
**🚀 Ready to create amazing images directly in your browser?**
*This model brings the power of Stable Diffusion to web applications while keeping your data completely private and secure.*
**Developed with ❤️ by Zhare-AI for the open-source community**
[🌐 Visit Zhare.ai](https://zhare.ai) | [📧 Contact Us](mailto:[email protected]) | [💬 Join Discussion](https://github.com/Zhare-AI)
</div>