Model Overview
This model is a fine-tuned Denoising Diffusion Probabilistic Model (DDPM) for generating images of flowers using the Oxford Flowers dataset. It builds upon the pretrained google/ddpm-cifar10-32 model and is optimized for training on a GPU.
Model Details
Architecture: UNet2DModel
Noise Scheduler: DDPMScheduler
Training Data: Oxford Flowers dataset (nelorth/oxford-flowers)
Optimizer: AdamW
Learning Rate: 1e-4, adjusted using a cosine scheduler
Training Steps: 100 epochs
Batch Size: 64
Image Size: 32x32 pixels
Training Configuration
The training process involves the following steps:
Data Preprocessing:
Images resized to 32x32.
Random horizontal flipping applied for augmentation.
Normalized to the range [-1, 1].
Noise Addition:
Random noise added to images using a linear beta schedule.
Model Training:
The UNet model predicts the noise added to images.
The Mean Squared Error (MSE) loss is used.
The learning rate is adjusted with a cosine scheduler.
Checkpointing:
Model checkpoints are saved every 1000 steps.
Usage
Once trained, the model can be used for generating images of flowers. The trained model is saved as a DDPMPipeline and can be loaded for inference.
Model Inference
from optimum.intel.openvino import OVModelForImageGeneration
pipeline = OVModelForImageGeneration.from_pretrained("flower_diffusion_quantized", export=True)
images = pipeline(batch_size=4, num_inference_steps=50).images
images[0].show()
Model Variants
FP32 Version: Standard precision model.
FP16 Version: Reduced precision for lower memory usage.
Limitations and Considerations
Image Resolution: Trained at 32x32, which may limit the fine details.
Computational Requirements: A GPU is recommended for inference.
Dataset Bias: The model is trained solely on Oxford Flowers, so its generalization to other datasets is limited.
Quantized Model Accuracy: INT8 quantization may slightly reduce output quality but speeds up inference.
- Downloads last month
- 16