Text-to-Image
Diffusers
English
Harshit Agarwal committed on
Commit c82448a · 2 Parent(s): 42683a1 fa31b3f

Merge branch 'main' of https://huggingface.co/aharshit123456/learn_ddpm

Files changed (1)
  1. README.md +106 -98
README.md CHANGED
---
license: mit
datasets:
- uoft-cs/cifar10
- nyanko7/danbooru2023
language:
- en
---
# DDPM Project

This repository contains the implementation of Denoising Diffusion Probabilistic Models (DDPM).

## Table of Contents
- [Introduction](#introduction)
- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)

## Introduction
Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models that learn to generate data by reversing a diffusion process. This repository provides a comprehensive implementation of DDPM.
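To make this concrete, the forward (noising) process has a closed form that lets you jump straight to any timestep. The sketch below is illustrative only: the linear schedule length and beta values are example choices, not necessarily the ones used in this repository.

```python
import torch

# Example linear noise schedule (illustrative values, not this repo's exact config)
T = 300
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

def forward_diffusion(x0, t, noise=None):
    """Sample x_t ~ q(x_t | x_0) in closed form: sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    if noise is None:
        noise = torch.randn_like(x0)
    sqrt_ab = alphas_cumprod[t].sqrt().view(-1, 1, 1, 1)
    sqrt_one_minus_ab = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1, 1)
    return sqrt_ab * x0 + sqrt_one_minus_ab * noise, noise
```

The reverse model (the U-Net) is trained to predict the noise that was mixed in at step `t`, which is what "reversing the diffusion process" means in practice.
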
## Installation
To install the necessary dependencies, run:
```bash
pip install -r requirements.txt
```

## Usage
To train the model, use the following command:
```bash
python train.py
```
To generate samples, use:
```bash
python generate.py
```
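For readers curious about what a DDPM training step does under the hood, here is a minimal sketch of the simplified objective (regress the added noise with an MSE loss). It is illustrative and not necessarily how this repository's `train.py` is organized; it assumes the model takes `(x_t, t)` and that `alphas_cumprod` comes from a schedule like the one sketched in the Introduction.

```python
import torch
import torch.nn.functional as F

def training_step(model, x0, alphas_cumprod, optimizer, device="cuda"):
    """One DDPM training step: sample t, noise x0 to x_t, and regress the noise."""
    x0 = x0.to(device)
    batch_size = x0.shape[0]
    T = len(alphas_cumprod)
    t = torch.randint(0, T, (batch_size,), device=device)

    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod.to(device)[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

    # The model (e.g. SimpleUnet) is assumed to take (x_t, t) and predict the noise
    predicted_noise = model(x_t, t)
    loss = F.mse_loss(predicted_noise, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
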
## Game
To help understand the model and its workings, we're building a cute little game in which the user plays the role of the U-Net reverse-diffusion model and is tasked with denoising images whose noise is made of grids of lines.

Use [learndiffusion.vercel.app](https://learndiffusion.vercel.app) to access the primitive version of the game. You can also contribute to the game by checking out the diffusion_game branch. A model showcase will also be added: the model's weights will be downloaded from the internet, the model files installed, and everything loaded into a Gradio interface for direct use/inference on the Vercel deployment. Feel free to make changes for this; an issue is open.

## Explanations and Mathematics
- slides from presentation:
- notes/explanations: [HERE](slides/notes)
- a cute lab talk ppt:
- Plato's allegory: \<link to REPUBLIC>

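For quick reference, these are the core equations from the original DDPM paper (Ho et al., 2020) that the notes and slides build on: the forward noising process, its closed form, and the simplified training objective.

$$
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\mathbf{I}\right),
\qquad
\bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s)
$$

$$
L_{\text{simple}}(\theta) = \mathbb{E}_{t,\,x_0,\,\epsilon}\left[\left\lVert\,\epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\; t\right)\right\rVert^2\right]
$$

The reverse process is parameterized by a U-Net \(\epsilon_\theta\) that predicts the noise added to \(x_0\).
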
## Resources
- Original Paper: https://arxiv.org/pdf/2006.11239
- Improvement Paper: https://arxiv.org/abs/2102.09672
- Improvement by OpenAI: https://arxiv.org/pdf/2105.05233
- Stable Diffusion Paper: https://arxiv.org/abs/2112.10752

### Papers for background
- U-Net paper for biomedical segmentation
- Autoencoder
- Variational Autoencoder
- Markov Hierarchical VAE
- Introductory lectures on diffusion processes

### YouTube videos and courses
#### Mathematics
- Outliers
- Omar Jahil

#### PyTorch Implementation
- [Deep Findr](https://www.youtube.com/watch?v=a4Yfz2FxXiY)
- [Notebook from Deep Findr](https://colab.research.google.com/drive/1sjy9odlSSy0RBVgMTgP7s99NXsqglsUL?usp=sharing)

## Pretrained Weights
Weights for the model can be found in [pretrained_weights](https://drive.google.com/drive/folders/1NiQDI3e67I9FITVnrzNPP2Az0LABRpic?usp=sharing).

To load the pretrained weights:
```python
import torch

# SimpleUnet is the U-Net model class defined in this repository
model2 = SimpleUnet()
model2.load_state_dict(torch.load("/content/drive/MyDrive/Research Work/mlsa/DDPM/model_weights.pth"))
model2.eval()
```

For running inference:

TODO: There are errors in the sampling function (boolean errors, etc.). Issues will be opened so others can solve them as an exercise if needed.
```python
import torch
from torchvision.utils import save_image

num_samples = 8  # Number of images to generate
image_size = (3, 32, 32)  # Example for CIFAR-10
noise = torch.randn(num_samples, *image_size).to("cuda")

model2.to("cuda")
# Generate images by denoising
with torch.no_grad():
    generated_images = model2.sample(noise)

# Save the generated images
save_image(generated_images, "generated_images.png", nrow=4, normalize=True)
```
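Until the sampling function is fixed, a generic DDPM ancestral-sampling loop (Ho et al., 2020) can serve as a reference. This sketch assumes the model takes `(x_t, t)` and predicts the added noise, which may differ from how `model2.sample` is actually implemented here.

```python
import torch

@torch.no_grad()
def ddpm_sample(model, shape, betas, device="cuda"):
    """Generic DDPM ancestral sampling with sigma_t^2 = beta_t."""
    betas = betas.to(device)
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape, device=device)
    for t in reversed(range(len(betas))):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        eps = model(x, t_batch)  # predicted noise
        alpha_t = alphas[t]
        alpha_bar_t = alphas_cumprod[t]
        # Posterior mean of x_{t-1} given x_t and the predicted noise
        mean = (x - (1.0 - alpha_t) / torch.sqrt(1.0 - alpha_bar_t) * eps) / torch.sqrt(alpha_t)
        if t > 0:
            x = mean + torch.sqrt(betas[t]) * torch.randn_like(x)
        else:
            x = mean
    return x
```

The `betas` schedule passed in must match the one used during training.
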

## Contributing
Contributions are welcome! Please open an issue or submit a pull request.

## Future Ideas
- Make the model ONNX-compatible for training and inference on Intel GPUs (see the export sketch below)
- Build a Stable Diffusion Text2Img model using a CLIP implementation!
- Train the current model on a much larger dataset with more generalizations and nuances
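
As a starting point for the ONNX idea above, an export sketch might look like the following. The `(x, t)` input signature is an assumption about SimpleUnet, and the time-embedding ops inside the network may need adjustment to export cleanly.

```python
import torch

model2.eval()
dummy_x = torch.randn(1, 3, 32, 32)             # example CIFAR-10-sized input
dummy_t = torch.tensor([10], dtype=torch.long)  # example timestep

torch.onnx.export(
    model2.cpu(),
    (dummy_x, dummy_t),
    "ddpm_unet.onnx",
    input_names=["x", "t"],
    output_names=["predicted_noise"],
    dynamic_axes={"x": {0: "batch"}, "t": {0: "batch"}, "predicted_noise": {0: "batch"}},
    opset_version=17,
)
```

The exported graph could then be run with ONNX Runtime, for example via its OpenVINO execution provider on Intel hardware.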