metadata
license: mit
language:
- en
- zh
tags:
- YOLO World
pipeline_tag: zero-shot-object-detection
YOLOWorld
This SDK enables efficient open-vocabulary object detection using YOLO-Worldv2 Large, optimized for Axera's NPU-based SoC platforms, including the AX650 Series, AX630C Series, and AX8850 Series, as well as Axera's dedicated AI accelerator.
Reference links:
If you are interested in model conversion, you can export the axmodel files yourself using the following resources:
- The open-source yoloworld.axera GitHub repo
- How to convert the yoloworld models
- Pulsar2 Link, How to Convert ONNX to axmodel
Supported Platforms
- AX650
- AX630C
Performance
Model | Input Shape | Latency (ms) | CMM Usage (MB) |
---|---|---|---|
yolo_u16_ax650.axmodel | 1 x 640 x 640 x 3 | 9.522 | 21 |
clip_b1_u16_ax650.axmodel | 1 x 77 | 2.997 | 137 |
yolo_u16_ax630c.axmodel | 1 x 640 x 640 x 3 | 43.450 | 31 |
clip_b1_u16_ax630c.axmodel | 1 x 77 | 10.703 | 134 |
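As a rough reading of the table, each latency can be converted into an upper bound on throughput. The sketch below assumes that a latency value is the time for one NPU inference and ignores pre-processing, post-processing, and data transfer. It also assumes the text encoder runs once per class prompt, which the table does not state. The helper names `max_fps` and `text_encoding_ms` are illustrative only.

```python
# Latencies in milliseconds, copied from the performance table.
LATENCY_MS = {
    "yolo_u16_ax650.axmodel": 9.522,
    "clip_b1_u16_ax650.axmodel": 2.997,
    "yolo_u16_ax630c.axmodel": 43.450,
    "clip_b1_u16_ax630c.axmodel": 10.703,
}


def max_fps(latency_ms: float) -> float:
    """Upper bound on inferences per second for a given latency."""
    return 1000.0 / latency_ms


def text_encoding_ms(clip_latency_ms: float, num_classes: int) -> float:
    """Cost of encoding the prompts, ASSUMING one encoder run per class."""
    return clip_latency_ms * num_classes


if __name__ == "__main__":
    for name, ms in LATENCY_MS.items():
        print(f"{name}: {ms} ms -> at most {max_fps(ms):.1f} inferences/s")
    clip_ms = LATENCY_MS["clip_b1_u16_ax650.axmodel"]
    print(f"4 prompts on AX650: about {text_encoding_ms(clip_ms, 4):.3f} ms")
```

Real end-to-end frame rates will be lower than these bounds, since image decoding, resizing, and box decoding run outside the NPU.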
How to use
Download all files from this repository to the device
(py312) axera@raspberrypi:~/samples/yoloworldv2 $ tree
.
├── config.json
├── football.jpg
├── install
│ ├── bin
│ │ ├── axcl_aarch64
│ │ │ └── test_detect_by_text
│ │ ├── axcl_x86
│ │ │ └── test_detect_by_text
│ │ └── host_650
│ │ └── test_detect_by_text
│ └── lib
│ ├── axcl_aarch64
│ │ └── libyoloworld.so
│ ├── axcl_x86
│ │ └── libyoloworld.so
│ └── host_650
│ └── libyoloworld.so
├── models
│ ├── clip_b1_u16_ax630c.axmodel
│ ├── clip_b1_u16_ax650.axmodel
│ ├── yolo_u16_ax630c.axmodel
│ └── yolo_u16_ax650.axmodel
├── pyyoloworld
│ ├── example.py
│ ├── gardio_example.jpg
│ ├── gradio_example.py
│ ├── libyoloworld.so
│ ├── pyaxdev.py
│ ├── __pycache__
│ │ ├── pyaxdev.cpython-312.pyc
│ │ └── pyyoloworld.cpython-312.pyc
│ ├── pyyoloworld.py
│ └── requirements.txt
├── README.md
└── vocab.txt
13 directories, 23 files
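Before running the demo, it can help to confirm that the download is complete. The following is a minimal sketch that checks for a subset of the files shown in the tree above. `required_files`, `missing_files`, and their `chip` and `runtime` arguments are hypothetical helpers for illustration and are not part of the SDK.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
COMMON_FILES = [
    "config.json",
    "vocab.txt",
    "pyyoloworld/pyyoloworld.py",
    "pyyoloworld/pyaxdev.py",
    "pyyoloworld/requirements.txt",
]


def required_files(chip: str, runtime: str) -> list[str]:
    """List the files needed for one chip ("ax650" or "ax630c") and one
    runtime directory ("axcl_aarch64", "axcl_x86" or "host_650")."""
    return COMMON_FILES + [
        f"models/yolo_u16_{chip}.axmodel",
        f"models/clip_b1_u16_{chip}.axmodel",
        f"install/lib/{runtime}/libyoloworld.so",
    ]


def missing_files(root, chip: str = "ax650", runtime: str = "axcl_aarch64") -> list[str]:
    """Return the required files that do not exist under root."""
    root = Path(root)
    return [p for p in required_files(chip, runtime) if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    print("all files present" if not missing else f"missing: {missing}")
```

Run it from the repository root; an empty result means every listed file was found.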
Python environment requirements
pip install -r pyyoloworld/requirements.txt
Inference with an AX650 host, such as the M4N-Dock (爱芯派Pro)
TODO
Inference with M.2 Accelerator card
What is the M.2 Accelerator card? This demo runs on a Raspberry Pi 5.
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libstdc++.so.6
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cp install/lib/axcl_aarch64/libyoloworld.so pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg $ cd pyyoloworld/
(py312) axera@raspberrypi:~/samples/yoloworldv2-new.hg/pyyoloworld $ python gradio_example.py --yoloworld ../models/yolo_u16_ax650.axmodel --tenc ../models/clip_b1_u16_ax650.axmodel --vocab ../vocab.txt
Trying to load: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/aarch64/libyoloworld.so
✅ Successfully loaded: /home/axera/samples/yoloworldv2-new.hg/pyyoloworld/libyoloworld.so
[I][ run][ 31]: AXCLWorker start with devid 0
input size: 2
name: images [unknown] [unknown]
1 x 640 x 640 x 3 size: 1228800
name: txt_feats [unknown] [unknown]
1 x 4 x 512 size: 8192
output size: 3
name: stride8
1 x 80 x 80 x 68 size: 1740800
name: stride16
1 x 40 x 40 x 68 size: 435200
name: stride32
1 x 20 x 20 x 68 size: 108800
[I][ yw_create][ 408]: num_classes: 4, num_features: 512, input w: 640, h: 640
is_output_nhwc: 1
input size: 1
name: text_token [unknown] [unknown]
1 x 77 size: 308
output size: 1
name: 2202
1 x 1 x 512 size: 2048
[I][ load_text_encoder][ 44]: text feature len 512
[I][ load_tokenizer][ 60]: text token len 77
* Running on local URL: http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.
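The tensor sizes in the log above can be cross-checked against the printed shapes. The sketch below assumes that `size` is a byte count. Under that assumption the `images` input is consistent with 1-byte elements and every other tensor with 4-byte elements; these element widths are inferred from the numbers and are not stated by the SDK. Reading the 68 output channels as 4 x 16 box-regression bins plus 4 class scores is likewise an assumption, based on a YOLOv8-style detection head and the `num_classes: 4` line.

```python
from math import prod

# (name, shape, ASSUMED bytes per element, size reported in the log)
TENSORS = [
    ("images", (1, 640, 640, 3), 1, 1228800),
    ("txt_feats", (1, 4, 512), 4, 8192),
    ("stride8", (1, 80, 80, 68), 4, 1740800),
    ("stride16", (1, 40, 40, 68), 4, 435200),
    ("stride32", (1, 20, 20, 68), 4, 108800),
    ("text_token", (1, 77), 4, 308),
    ("2202", (1, 1, 512), 4, 2048),
]

REG_MAX = 16  # assumption: YOLOv8-style head with 16 bins per box side


def byte_size(shape, bytes_per_element: int) -> int:
    """Number of bytes occupied by a dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


def grid_size(input_size: int, stride: int) -> int:
    """Side length of the output grid for a given stride."""
    return input_size // stride


def head_channels(num_classes: int, reg_max: int = REG_MAX) -> int:
    """Channels per grid cell: 4 box sides x reg_max bins, plus class scores."""
    return 4 * reg_max + num_classes


if __name__ == "__main__":
    for name, shape, width, reported in TENSORS:
        status = "ok" if byte_size(shape, width) == reported else "mismatch"
        print(f"{name}: computed {byte_size(shape, width)}, log {reported} ({status})")
    print("grids:", [grid_size(640, s) for s in (8, 16, 32)])
    print("channels for 4 classes:", head_channels(4))
```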
If your Raspberry Pi 5's IP address is 192.168.1.100, open http://192.168.1.100:7860 in a web browser to use the web app.
Input: man, shoes, ball, person and the test image
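The log reports `num_classes: 4` and a `txt_feats` input of 1 x 4 x 512, which suggests that this model build takes at most four text prompts at a time. The sketch below is a hypothetical helper (not part of `pyyoloworld`) that turns a comma-separated prompt string into a class list and rejects lists that are too long. How the SDK handles fewer than four prompts is not documented here, so the helper does not pad.

```python
def parse_classes(prompt: str, max_classes: int = 4) -> list[str]:
    """Split a comma-separated prompt into class names.

    max_classes defaults to 4 to match the num_classes value in the log;
    this limit is an assumption about the model, not a documented rule.
    """
    classes = [part.strip() for part in prompt.split(",")]
    classes = [c for c in classes if c]  # drop empty entries
    if len(classes) > max_classes:
        raise ValueError(f"expected at most {max_classes} classes, got {len(classes)}")
    return classes


if __name__ == "__main__":
    print(parse_classes("man, shoes, ball, person"))
```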

Result: