androstj committed on
Commit 7258142 · 1 Parent(s): b400d63

Fix README.md

README.md CHANGED
@@ -13,8 +13,11 @@ Audiobox-Aesthetics is introduced in [Meta Audiobox Aesthetics: Unified Automati
 
  **Model Developer**: FAIR @ Meta AI
 
- **Model Architecture**: Audiobox-Aesthetics
+ **Model Architecture**:
 
+ <img src="assets/aes_model.png" alt="Model" height="400px">
+
+ Audiobox-Aesthetics is based on a simple Transformer-based architecture. Specifically, the audio encoder follows a WavLM-based structure, consisting of several CNN layers and 12 Transformer (Vaswani et al., 2017) layers with 768 hidden dimensions. To predict the output, we project the audio embedding through multiple multi-layer perceptron (MLP) blocks, one for each axis (PQ, PC, CE, CU), where each MLP block consists of 5 non-linear layers. The model is trained with standard regression losses (mean absolute error and mean squared error).
 
  # How to install
  We are providing 2 ways to run the model:
aes_model.png → assets/aes_model.png RENAMED
File without changes
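
For reviewers: the training objective described in the added README paragraph (mean absolute error plus mean squared error over the four axes PQ, PC, CE, CU) could be sketched roughly as below. This is a minimal illustration, not the repository's training code. The function names, the equal weighting of the two terms, and the averaging over axes are all assumptions, since the README only names the two losses.

```python
from typing import Dict, Sequence

# The four axes named in the README paragraph.
AXES = ("PQ", "PC", "CE", "CU")


def mae(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error over one batch of scalar scores."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over one batch of scalar scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    preds: Dict[str, Sequence[float]],
    targets: Dict[str, Sequence[float]],
) -> float:
    """Hypothetical combined objective: (MAE + MSE) per axis, averaged over axes.

    Assumption: both terms are weighted equally; the README does not say.
    """
    total = 0.0
    for axis in AXES:
        total += mae(preds[axis], targets[axis]) + mse(preds[axis], targets[axis])
    return total / len(AXES)
```

Each axis has its own prediction head in the described architecture, so the sketch keeps predictions and targets keyed by axis name rather than stacking them into a single vector.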