Segment Anything 2.1 RKNN2


Run the powerful Segment Anything 2.1 image segmentation model on RK3588!

  • Inference Speed (RK3588):

    • Encoder (Tiny, single NPU core): 3s
    • Encoder (Small, single NPU core): 3.5s
    • Encoder (Large, single NPU core): 12s
    • Decoder (CPU): 0.1s
  • Memory Usage (RK3588):

    • Encoder (Tiny): 0.95GB
    • Encoder (Small): 1.1GB
    • Encoder (Large): 4.1GB
    • Decoder: Negligible

Usage

  1. Clone or download this repository. The models are large, so ensure you have sufficient disk space.

  2. Install dependencies

pip install "numpy<2" pillow matplotlib opencv-python onnxruntime rknn-toolkit-lite2
  3. Run
python test_rknn.py
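On the RK3588 itself, the encoder step inside test_rknn.py presumably uses the rknn-toolkit-lite2 runtime pinned to one NPU core (which is what the single-core timings above measure). The sketch below shows that standard RKNNLite flow; the function name and the exact input layout are assumptions, not the repo's exact code.

```python
def run_encoder(rknn_path: str, input_tensor):
    """Run an RKNN encoder on a single NPU core of the RK3588.

    A sketch of the usual rknn-toolkit-lite2 flow; treat names and
    error handling as illustrative, not as this repo's exact code.
    """
    from rknnlite.api import RKNNLite  # only available on the board itself

    rknn = RKNNLite()
    if rknn.load_rknn(rknn_path) != 0:
        raise RuntimeError("load_rknn failed")
    # pin inference to one NPU core, matching the timings listed above
    if rknn.init_runtime(core_mask=RKNNLite.NPU_CORE_0) != 0:
        raise RuntimeError("init_runtime failed")
    outputs = rknn.inference(inputs=[input_tensor])
    rknn.release()
    return outputs
```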

You can modify this part in test_rknn.py

def main():
    # 1. Load original image
    path = "dog.jpg"
    orig_image, input_image, (scale, offset_x, offset_y) = load_image(path)
    decoder_path = "sam2.1_hiera_small_decoder.onnx"
    encoder_path = "sam2.1_hiera_small_encoder.rknn"
    ...

to test different models and images. Note that unlike SAM1, the encoder and decoder must use the same version of the model.
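To illustrate what the `(scale, offset_x, offset_y)` tuple returned by load_image is for, here is a minimal, dependency-light sketch of aspect-preserving letterbox preprocessing. It assumes the common SAM 2 input resolution of 1024×1024 and centered padding; the repo's actual load_image may differ (e.g. pad bottom-right), so this is a sketch, not the repo's implementation.

```python
import numpy as np

INPUT_SIZE = 1024  # assumed SAM 2.1 encoder input resolution


def letterbox(image: np.ndarray, size: int = INPUT_SIZE):
    """Resize an HWC image with preserved aspect ratio, pad to size x size.

    Returns the padded image plus (scale, offset_x, offset_y): the values
    needed to map prompt points and output masks between the original and
    model coordinate systems.
    """
    h, w = image.shape[:2]
    scale = size / max(h, w)
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    # nearest-neighbour resize via index arrays keeps this sketch dependency-free
    ys = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = image[ys[:, None], xs[None, :]]
    offset_x, offset_y = (size - new_w) // 2, (size - new_h) // 2
    padded = np.zeros((size, size, image.shape[2]), dtype=image.dtype)
    padded[offset_y:offset_y + new_h, offset_x:offset_x + new_w] = resized
    return padded, (scale, offset_x, offset_y)


def to_model_coords(x, y, scale, offset_x, offset_y):
    """Map a prompt point from original-image to model-input coordinates."""
    return x * scale + offset_x, y * scale + offset_y
```

The inverse transform (subtract the offsets, divide by the scale) maps the decoder's mask back onto the original image.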

Model Conversion

  1. Install dependencies
pip install "numpy<2" onnxslim onnxruntime rknn-toolkit2 sam2
  2. Download the SAM2.1 .pt model files. You can download them from here.

  3. Convert the .pt models to ONNX models, using the Tiny model as an example:

python ./export_onnx.py --model_type sam2.1_hiera_tiny --checkpoint ./sam2.1_hiera_tiny.pt --output_encoder ./sam2.1_hiera_tiny_encoder.onnx --output_decoder sam2.1_hiera_tiny_decoder.onnx
  4. Convert the ONNX models to RKNN models, using the Tiny model as an example:
python ./convert_rknn.py sam2.1_hiera_tiny

If you encounter errors during constant folding, try updating onnxruntime to the latest version.
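For reference, convert_rknn.py presumably follows the standard rknn-toolkit2 conversion workflow. The sketch below shows that workflow for the encoder; the function name and config details are assumptions and the repo's actual script may set additional options (e.g. mean/std normalization).

```python
def convert_to_rknn(onnx_path: str, rknn_path: str) -> None:
    """Convert a float ONNX encoder into an RKNN model for the RK3588 NPU.

    A sketch of the standard rknn-toolkit2 flow, not this repo's exact
    convert_rknn.py.
    """
    # rknn-toolkit2 runs on an x86 host, not on the board, so import lazily
    from rknn.api import RKNN

    rknn = RKNN(verbose=True)
    rknn.config(target_platform="rk3588")  # build for the RK3588 NPU
    if rknn.load_onnx(model=onnx_path) != 0:
        raise RuntimeError("load_onnx failed")
    # keep float weights, so no quantization dataset is needed
    if rknn.build(do_quantization=False) != 0:
        raise RuntimeError("build failed")
    if rknn.export_rknn(rknn_path) != 0:
        raise RuntimeError("export_rknn failed")
    rknn.release()
```

On the conversion host this would be invoked as, e.g., `convert_to_rknn("sam2.1_hiera_tiny_encoder.onnx", "sam2.1_hiera_tiny_encoder.rknn")`.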

Known Issues

  • Only image segmentation is implemented; video segmentation is not supported.
  • Due to an issue in RKNN-Toolkit2, converting the decoder model fails. For now the decoder must run on the CPU via onnxruntime, which slightly increases CPU usage.
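The CPU workaround for the decoder amounts to opening a plain CPU-only onnxruntime session. A minimal sketch; the decoder's input names and shapes depend on how export_onnx.py exported it, so this just opens the session and reports what it expects.

```python
def load_decoder(decoder_path: str):
    """Open the decoder ONNX model on the CPU, as a workaround for the
    RKNN conversion failure described above."""
    import onnxruntime as ort

    sess = ort.InferenceSession(decoder_path, providers=["CPUExecutionProvider"])
    # the expected inputs are determined by export_onnx.py
    input_names = [inp.name for inp in sess.get_inputs()]
    return sess, input_names
```

A decode is then one `sess.run(None, feed)` call, with `feed` mapping those input names to NumPy arrays.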

