hubertsiuzdak committed on
Commit 8f97ac2 · verified · 1 Parent(s): 2491d40

Update README.md

Files changed (1)
  1. README.md +48 -0
README.md CHANGED
@@ -1,3 +1,51 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
  license: mit
3
+ tags:
4
+ - audio
5
  ---
6
+
7
+ # SNAC 🍿
8
+
9
+ Multi-**S**cale **N**eural **A**udio **C**odec (SNAC) compresses 44.1 kHz audio into discrete codes at a low bitrate.
10
+
11
+ See GitHub repository: https://github.com/hubertsiuzdak/snac/
12
+
13
+ ## Overview
14
+
15
+ SNAC encodes audio into hierarchical tokens similarly to SoundStream, EnCodec, and DAC. However, SNAC introduces a simple change where coarse tokens are sampled less frequently,
16
+ covering a broader time span.
17
+
18
+ This model compresses 32 kHz audio into discrete codes at a 1.9 kbps bitrate. It uses 4 RVQ levels with token rates of 10, 21, 42, and
19
+ 83 Hz.
20
+
21
+ ## Usage
22
+
23
+ Install it using:
24
+
25
+ ```bash
26
+ pip install snac
27
+ ```
28
+ To encode (and reconstruct) audio with SNAC in Python, use the following code:
29
+
30
+ ```python
31
+ import torch
32
+ from snac import SNAC
33
+
34
+ model = SNAC.from_pretrained("hubertsiuzdak/snac_32khz").eval().cuda()
35
+ audio = torch.randn(1, 1, 32000).cuda() # B, 1, T
36
+
37
+ with torch.inference_mode():
38
+     audio_hat, _, codes, _, _ = model(audio)
39
+ ```
40
+
41
+ ⚠️ Note that `codes` is a list of token sequences of variable lengths, each corresponding to a different temporal
42
+ resolution.
43
+
44
+ ```
45
+ >>> [code.shape[1] for code in codes]
46
+ [12, 24, 48, 96]
47
+ ```
48
+
49
+ ## Acknowledgements
50
+
51
+ Module definitions are adapted from the [Descript Audio Codec](https://github.com/descriptinc/descript-audio-codec).
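
As a sanity check on the figures in the Overview section of the updated README, the 1.9 kbps bitrate can be approximately recovered from the four token rates. The sketch below is not part of the commit. It assumes each RVQ codebook holds 4096 entries (12 bits per token), which the README does not state, and the names `TOKEN_RATES_HZ`, `CODEBOOK_SIZE`, and `estimate_bitrate_bps` are illustrative only.

```python
import math

# Token rates (Hz) for the four RVQ levels, as stated in the Overview.
TOKEN_RATES_HZ = [10, 21, 42, 83]

# ASSUMPTION: each codebook has 4096 entries (12 bits per token).
# The README does not state the codebook size.
CODEBOOK_SIZE = 4096


def estimate_bitrate_bps(token_rates_hz, codebook_size):
    """Estimate bits per second: total tokens per second times bits per token."""
    bits_per_token = math.log2(codebook_size)
    return sum(token_rates_hz) * bits_per_token


bitrate_bps = estimate_bitrate_bps(TOKEN_RATES_HZ, CODEBOOK_SIZE)
print(f"{bitrate_bps / 1000:.2f} kbps")
```

Under that assumption, the rounded token rates give roughly 1.87 kbps, which is consistent with the stated 1.9 kbps.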