---
base_model:
- Qwen/Qwen2.5-14B
pipeline_tag: text-generation
library_name: transformers
tags:
- rknn
- rkllm
- chat
- rk3588
---

## 3ib0n's RKLLM Guide

These models and binaries require an RK3588 board running rknpu driver version 0.9.7 or above.

## Steps to reproduce conversion

```shell
# Download and set up Miniforge3
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

# activate the base environment
source ~/miniforge3/bin/activate

# create and activate a Python 3.8 environment
conda create -n rknn-llm-1.1.4 python=3.8
conda activate rknn-llm-1.1.4

# clone the latest rknn-llm toolkit and enter the demo directory
git clone https://github.com/airockchip/rknn-llm.git
cd rknn-llm/examples/rkllm_multimodel_demo

# install dependencies for the toolkit
pip install transformers accelerate torchvision rknn-toolkit2==2.2.1
pip install --upgrade torch pillow

# install the rkllm-toolkit wheel (path is relative to the demo directory)
pip install ../../rkllm-toolkit/packages/rkllm_toolkit-1.1.4-cp38-cp38-linux_x86_64.whl

# edit the export script to update the input and output paths, then run it
nano export/export_rkllm.py
python export/export_rkllm.py
```

Example `export_rkllm.py`, modified from https://github.com/airockchip/rknn-llm/blob/main/examples/rkllm_multimodel_demo/export/export_rkllm.py

```python
import os

from rkllm.api import RKLLM

# expanduser resolves the leading "~" so the toolkit receives an absolute path
modelpath = os.path.expanduser("~/models/Qwen/Qwen2.5-14B-Instruct/")  ## UPDATE HERE
savepath = './Qwen2.5-14B-Instruct.rkllm'  ## UPDATE HERE

llm = RKLLM()

# Load model
# Use 'export CUDA_VISIBLE_DEVICES=2' to specify the GPU device
ret = llm.load_huggingface(model=modelpath, device='cpu')
if ret != 0:
    print('Load model failed!')
    exit(ret)

# Build model
## Do not use the dataset parameter, as we are converting a pure text model, not a multimodal one
qparams = None
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w8a8',
                quantized_algorithm='normal', target_platform='rk3588',
                num_npu_core=3, extra_qparams=qparams)
if ret != 0:
    print('Build model failed!')
    exit(ret)

# Export rkllm model
ret = llm.export_rkllm(savepath)
if ret != 0:
    print('Export model failed!')
    exit(ret)
```

## Steps to build and run demo

```shell
# Download the correct toolchain for working with rkllm
# Documentation here: https://github.com/airockchip/rknn-llm/blob/main/doc/Rockchip_RKLLM_SDK_EN_1.1.0.pdf
wget https://developer.arm.com/-/media/Files/downloads/gnu-a/10.2-2020.11/binrel/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz
tar -xf gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz

# ensure that the gcc compiler path in the build script points to the location
# where the toolchain downloaded above was unpacked
nano deploy/build-linux.sh

# compile the demo app
cd deploy/
./build-linux.sh
```

## Steps to run the app

More information and the original guide: https://github.com/airockchip/rknn-llm/tree/main/examples/rkllm_multimodel_demo

```shell
# push the install dir to the device
adb push ./install/demo_Linux_aarch64 /data

# push the model file to the device
adb push Qwen2.5-14B-Instruct.rkllm /data/models

# the remaining commands run on the device
adb shell
cd /data/demo_Linux_aarch64

# export the lib path
export LD_LIBRARY_PATH=./lib

# soft link the models dir
ln -s /data/models .

# run the llm demo (pure text example);
# the numeric arguments are max_new_tokens and max_context_len
./llm models/Qwen2.5-14B-Instruct.rkllm 128 512
```
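## Troubleshooting notes

If `export_rkllm.py` fails with an import error, confirm the rkllm-toolkit wheel actually landed in the active conda environment. A one-line sanity check that exercises the same import the export script uses:

```shell
python -c "from rkllm.api import RKLLM; print('rkllm-toolkit OK')"
```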
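To verify the board meets the driver requirement stated at the top of this guide, and to confirm the NPU is actually busy while the demo generates, the rknpu driver usually exposes debugfs nodes. A quick check, assuming your kernel mounts debugfs at the usual location (the exact paths can vary between vendor kernels):

```shell
# run these on the RK3588 board itself
sudo cat /sys/kernel/debug/rknpu/version  # should report 0.9.7 or above
sudo cat /sys/kernel/debug/rknpu/load     # NPU core utilization; poll while the demo runs
```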