This SDK enables efficient open-vocabulary object detection using YOLO-Worldv2 Large, optimized for Axera's NPU-based SoC platforms, including the AX650 series, AX630C series, and AX8850 series, as well as Axera's dedicated AI accelerator.
For those who are interested in model conversion, you can try to export axmodel through
| Model | Input Shape | Latency (ms) | CMM Usage (MB) |
|---|---|---|---|
| yolo_u16_ax650.axmodel | 1 x 640 x 640 x 3 | 9.522 | 21 |
| clip_b1_u16_ax650.axmodel | 1 x 77 | 2.997 | 137 |
| yolo_u16_ax630c.axmodel | 1 x 640 x 640 x 3 | 43.450 | 31 |
| clip_b1_u16_ax630c.axmodel | 1 x 77 | 10.703 | 134 |
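To put the detector latencies in context, the sketch below converts them into an upper bound on frames per second. It assumes inference is the only cost (no pre- or post-processing) and that the text encoder runs once per prompt set rather than once per frame; neither assumption is stated in this README, so real throughput will likely be lower. `max_fps` is a hypothetical helper name.

```python
# Latencies restated from the table above (milliseconds per inference).
LATENCY_MS = {
    "yolo_u16_ax650.axmodel": 9.522,
    "clip_b1_u16_ax650.axmodel": 2.997,
    "yolo_u16_ax630c.axmodel": 43.450,
    "clip_b1_u16_ax630c.axmodel": 10.703,
}


def max_fps(latency_ms: float) -> float:
    """Upper bound on throughput if inference were the only per-frame cost."""
    return 1000.0 / latency_ms


# Only the detector runs per frame under the assumption stated above.
for name in ("yolo_u16_ax650.axmodel", "yolo_u16_ax630c.axmodel"):
    print(f"{name}: at most {max_fps(LATENCY_MS[name]):.1f} FPS")
```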
Download all files from this repository to the device
(py312) axera@raspberrypi:~/samples/yoloworldv2 $ tree
.
├── config.json
├── football.jpg
├── install
│   ├── bin
│   │   ├── axcl_aarch64
│   │   │   └── test_detect_by_text
│   │   ├── axcl_x86
│   │   │   └── test_detect_by_text
│   │   └── host_650
│   │       └── test_detect_by_text
│   └── lib
│       ├── axcl_aarch64
│       │   └── libyoloworld.so
│       ├── axcl_x86
│       │   └── libyoloworld.so
│       └── host_650
│           └── libyoloworld.so
├── models
│   ├── clip_b1_u16_ax630c.axmodel
│   ├── clip_b1_u16_ax650.axmodel
│   ├── yolo_u16_ax630c.axmodel
│   └── yolo_u16_ax650.axmodel
├── pyyoloworld
│   ├── example.py
│   ├── gardio_example.jpg
│   ├── gradio_example.py
│   ├── libyoloworld.so
│   ├── pyaxdev.py
│   ├── __pycache__
│   │   ├── pyaxdev.cpython-312.pyc
│   │   └── pyyoloworld.cpython-312.pyc
│   ├── pyyoloworld.py
│   └── requirements.txt
├── README.md
└── vocab.txt

13 directories, 23 files
pip install -r pyyoloworld/requirements.txt
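Before running the demo, it can help to confirm that the download is complete. The following is a minimal sketch, not part of the SDK: the file paths come from the directory listing above, but which files count as "required" is an assumption, and `required_files` and `missing_files` are hypothetical helper names.

```python
from pathlib import Path

# Relative paths taken from the directory listing above.
# Treating exactly these as required is an assumption of this sketch.
COMMON_FILES = ["vocab.txt", "pyyoloworld/pyyoloworld.py", "pyyoloworld/pyaxdev.py"]


def required_files(chip: str) -> list[str]:
    """Files assumed necessary for a chip suffix such as 'ax650' or 'ax630c'."""
    models = [
        f"models/yolo_u16_{chip}.axmodel",
        f"models/clip_b1_u16_{chip}.axmodel",
    ]
    return COMMON_FILES + models


def missing_files(root: str, chip: str) -> list[str]:
    """Return the required files that are not present under root."""
    base = Path(root)
    return [rel for rel in required_files(chip) if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".", "ax650")
    print("all files present" if not missing else f"missing: {missing}")
```

Run it from the repository root; an empty result means the model pair for the chosen chip and the Python wrapper files are in place.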
TODO
What is the M.2 accelerator card? This demo runs on a Raspberry Pi 5.
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libstdc++.so.6
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cp install/lib/axcl_aarch64/libyoloworld.so pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cd pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg/pyyoloworld $ python gradio_example.py --yoloworld ../models/yolo_u16_ax650.axmodel --tenc ../models/clip_b1_u16_ax650.axmodel --vocab ../vocab.txt
Trying to load: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/aarch64/libyoloworld.so
✅ Successfully loaded: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/libyoloworld.so
[I][ run][ 31]: AXCLWorker start with devid 0
input size: 2
name: images [unknown] [unknown]
1 x 640 x 640 x 3 size: 1228800
name: txt_feats [unknown] [unknown]
1 x 4 x 512 size: 8192
output size: 3
name: stride8
1 x 80 x 80 x 68 size: 1740800
name: stride16
1 x 40 x 40 x 68 size: 435200
name: stride32
1 x 20 x 20 x 68 size: 108800
[I][ yw_create][ 408]: num_classes: 4, num_features: 512, input w: 640, h: 640
is_output_nhwc: 1
input size: 1
name: text_token [unknown] [unknown]
1 x 77 size: 308
output size: 1
name: 2202
1 x 1 x 512 size: 2048
[I][ load_text_encoder][ 44]: text feature len 512
[I][ load_tokenizer][ 60]: text token len 77
* Running on local URL: http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.
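The shapes and byte sizes printed in the log above can be cross-checked with a little arithmetic. The sketch below derives the bytes per element for each tensor and confirms that the three output grids match the 640-pixel input divided by each stride. The log reports the data types as `[unknown]`, so reading 1 byte as 8-bit pixels and 4 bytes as 32-bit values is an inference, not something the log states. Likewise, reading the 68 output channels as 64 box-regression values plus the 4 prompt classes is an assumption based on common YOLOv8-style heads.

```python
from math import prod

# (shape, size in bytes) pairs copied from the log above.
TENSORS = {
    "images": ((1, 640, 640, 3), 1228800),
    "txt_feats": ((1, 4, 512), 8192),
    "stride8": ((1, 80, 80, 68), 1740800),
    "stride16": ((1, 40, 40, 68), 435200),
    "stride32": ((1, 20, 20, 68), 108800),
    "text_token": ((1, 77), 308),
    "2202": ((1, 1, 512), 2048),
}


def bytes_per_element(shape, size_bytes):
    """Element width implied by a tensor's shape and its total byte size."""
    count = prod(shape)
    assert size_bytes % count == 0, "size is not a whole multiple of the element count"
    return size_bytes // count


for name, (shape, size) in TENSORS.items():
    print(f"{name}: {bytes_per_element(shape, size)} byte(s) per element")

# Each output grid should equal the 640-pixel input side divided by its stride.
for stride in (8, 16, 32):
    shape, _ = TENSORS[f"stride{stride}"]
    assert shape[1] == shape[2] == 640 // stride
```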
If your Raspberry Pi 5's IP address is 192.168.1.100, open http://192.168.1.100:7860 in a browser to use the web app.
Input: the prompts `man`, `shoes`, `ball`, `person` and the test image.
Result: