Fix README.md
- README.md +4 -1
- aes_model.png → assets/aes_model.png +0 -0
README.md
CHANGED
@@ -13,8 +13,11 @@ Audiobox-Aesthetics is introduced in [Meta Audiobox Aesthetics: Unified Automati
**Model Developer**: FAIR @ Meta AI
- **Model Architecture**:
# How to install
We are providing 2 ways to run the model:
**Model Developer**: FAIR @ Meta AI
+ **Model Architecture**:
+ <img src="assets/aes_model.png" alt="Model" height="400px">
+
+ Audiobox-Aesthetics is based on a simple Transformer-based architecture. Specifically, the audio encoder follows a WavLM-based structure, consisting of several CNN layers and 12 Transformer (Vaswani et al., 2017) layers with 768 hidden dimensions. To predict the output, we project the audio embedding through multiple multi-layer perceptron (MLP) blocks, one for each axis (PQ, PC, CE, CU), where each MLP block consists of 5 non-linear layers. The model is trained with a standard regression loss (mean absolute error and mean squared error).
# How to install
We are providing 2 ways to run the model:
aes_model.png → assets/aes_model.png
RENAMED
File without changes
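
The architecture paragraph added to README.md says the model is trained with a regression loss built from mean absolute error and mean squared error over the four axes (PQ, PC, CE, CU). The README does not say how the two terms are combined, so the following is only a minimal sketch in plain Python that assumes an unweighted sum of MAE and MSE per axis, averaged over the axes. The function names `mae`, `mse`, and `regression_loss` and the example scores are hypothetical and are not taken from the Audiobox-Aesthetics code.

```python
from typing import Dict, List

# The four prediction axes named in the README.
AXES = ("PQ", "PC", "CE", "CU")


def mae(pred: List[float], target: List[float]) -> float:
    """Mean absolute error over one batch of scores."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error over one batch of scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def regression_loss(preds: Dict[str, List[float]],
                    targets: Dict[str, List[float]]) -> float:
    """Assumption: MAE + MSE with equal weight, averaged over the four axes.

    The README only names the two error terms; the weighting here is a guess.
    """
    per_axis = [mae(preds[a], targets[a]) + mse(preds[a], targets[a]) for a in AXES]
    return sum(per_axis) / len(AXES)


# Made-up scores: each axis has two predictions, one of them off by 1.0.
preds = {a: [5.0, 7.0] for a in AXES}
targets = {a: [6.0, 7.0] for a in AXES}
loss = regression_loss(preds, targets)
print(loss)  # 1.0  (MAE 0.5 + MSE 0.5 on every axis)
```

In a real training loop, each axis's predictions would come from that axis's own MLP block on top of the shared audio embedding; here they are hard-coded to keep the sketch self-contained.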