Update Readme ST Model Zoo
README.md
CHANGED
@@ -1,7 +1,3 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
pipeline_tag: audio-classification
|
4 |
-
---
|
5 |
# Quantized Yamnet
|
6 |
|
7 |
## **Use case** : `AED`
|
@@ -80,31 +76,30 @@ For Yamnet-1024
|
|
80 |
|
81 |
* `tl` stands for "transfer learning", meaning that the model backbone weights were initialized from a pre-trained model, then only the last layer was unfrozen during the training.
|
82 |
|
83 |
-
|
84 |
### Reference **NPU** memory footprint based on ESC-10 dataset
|
85 |
|Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
|
86 |
|----------|------------------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
|
87 |
-
| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 |
|
88 |
-
| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 |
|
89 |
|
90 |
### Reference **NPU** inference time based on ESC-10 dataset
|
91 |
| Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
|
92 |
|--------|------------------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
|
93 |
-
| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 1.
|
94 |
-
| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 9.88 | 101.21 | 10.
|
95 |
|
96 |
|
97 |
### Reference **MCU** memory footprint based on ESC-10 dataset
|
98 |
| Model | Format | Resolution | Series | Activation RAM (kB) | Runtime RAM (kB) | Weights Flash (kB) | Code Flash (kB) | Total RAM (kB) | Total Flash (kB) | STM32Cube.AI version |
|
99 |
|-------------------|--------|------------|---------|----------------|-------------|---------------|------------|-------------|-------------|-----------------------|
|
100 |
-
|[Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 109.57 | 7.61 | 135.91 |
|
101 |
-
|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 |
|
102 |
|
103 |
### Reference inference time based on ESC-10 dataset
|
104 |
| Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time | STM32Cube.AI version |
|
105 |
|-------------------|--------|------------|------------------|------------------|--------------|-----------------|-----------------------|
|
106 |
-
| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 1 CPU | 160 MHz |
|
107 |
-
|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 1 CPU + 1 NPU | 800 MHz / 1000 MHz |
|
108 |
|
109 |
|
110 |
### Accuracy with ESC-10 dataset
|
@@ -145,5 +140,3 @@ Note that accuracy with unknown class is lower. This is normal
|
|
145 |
|
146 |
Please refer to the stm32ai-modelzoo-services GitHub [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services)
|
147 |
|
148 |
-
|
149 |
-
|
1 |
# Quantized Yamnet
|
2 |
|
3 |
## **Use case** : `AED`
|
|
|
76 |
|
77 |
* `tl` stands for "transfer learning", meaning that the model backbone weights were initialized from a pre-trained model, then only the last layer was unfrozen during the training.
|
78 |
|
|
|
79 |
### Reference **NPU** memory footprint based on ESC-10 dataset
|
80 |
|Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
|
81 |
|----------|------------------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
|
82 |
+
| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 | 167.7 | 10.2.0 | 2.2.0 |
|
83 |
+
| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 | 3450.58 | 10.2.0 | 2.2.0 |
|
84 |
|
85 |
### Reference **NPU** inference time based on ESC-10 dataset
|
86 |
| Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
|
87 |
|--------|------------------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
|
88 |
+
| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 1.05 | 952.38 | 10.2.0 | 2.2.0 |
|
89 |
+
| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 9.88 | 101.21 | 10.2.0 | 2.2.0 |
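
The `Inf / sec` column in the table above appears to be the reciprocal of the inference time. A minimal sketch of that conversion, assuming one inference at a time with no pipelining (the helper name `inferences_per_second` is hypothetical and not part of any ST tool):

```python
def inferences_per_second(latency_ms: float) -> float:
    """Convert a single-inference latency in milliseconds to throughput."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Latencies copied from the NPU inference-time table.
for name, latency_ms in [("Yamnet 256", 1.05), ("Yamnet 1024", 9.88)]:
    print(f"{name}: {inferences_per_second(latency_ms):.2f} inf/sec")
# Yamnet 256: 952.38 inf/sec
# Yamnet 1024: 101.21 inf/sec
```

Both results match the published column to two decimal places.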
|
90 |
|
91 |
|
92 |
### Reference **MCU** memory footprint based on ESC-10 dataset
|
93 |
| Model | Format | Resolution | Series | Activation RAM (kB) | Runtime RAM (kB) | Weights Flash (kB) | Code Flash (kB) | Total RAM (kB) | Total Flash (kB) | STM32Cube.AI version |
|
94 |
|-------------------|--------|------------|---------|----------------|-------------|---------------|------------|-------------|-------------|-----------------------|
|
95 |
+
|[Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 109.57 | 7.61 | 135.91 | 56.67 | 117.18 | 192.58 | 10.2.0 |
|
96 |
+
|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 144.0 | 1.67 | 3450.58 | 252.48 | 145.67 | 3703.06 | 10.2.0 |
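
The totals in the MCU footprint table are consistent with simple sums of the other columns. The sketch below checks that reading. The sum relationship is inferred from the published numbers rather than stated in the README, and the `McuFootprint` class is a hypothetical helper:

```python
from dataclasses import dataclass


@dataclass
class McuFootprint:
    """One row of the MCU memory footprint table, all values in kB."""
    activation_ram: float
    runtime_ram: float
    weights_flash: float
    code_flash: float

    @property
    def total_ram(self) -> float:
        # Assumption: Total RAM = Activation RAM + Runtime RAM.
        return round(self.activation_ram + self.runtime_ram, 2)

    @property
    def total_flash(self) -> float:
        # Assumption: Total Flash = Weights Flash + Code Flash.
        return round(self.weights_flash + self.code_flash, 2)


# Values copied from the table rows.
yamnet_256 = McuFootprint(109.57, 7.61, 135.91, 56.67)
yamnet_1024 = McuFootprint(144.0, 1.67, 3450.58, 252.48)

print(yamnet_256.total_ram, yamnet_256.total_flash)
print(yamnet_1024.total_ram, yamnet_1024.total_flash)
```

Under these assumptions the computed totals agree with the `Total RAM (kB)` and `Total Flash (kB)` columns for both models.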
|
97 |
|
98 |
### Reference inference time based on ESC-10 dataset
|
99 |
| Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time | STM32Cube.AI version |
|
100 |
|-------------------|--------|------------|------------------|------------------|--------------|-----------------|-----------------------|
|
101 |
+
| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 279.62 ms | 10.2.0 |
|
102 |
+
|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 1 CPU + 1 NPU | 800 MHz / 1000 MHz | 9.88 ms | 10.2.0 |
|
103 |
|
104 |
|
105 |
### Accuracy with ESC-10 dataset
|
|
|
140 |
|
141 |
Please refer to the stm32ai-modelzoo-services GitHub [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services)
|
142 |
|