Phi-3.5-mini-instruct-onnx-gpu (unofficial version)

Note: This is an unofficial version, intended only for testing and development.

This is an ONNX GPU version of Phi-3.5-mini-instruct, built with ONNX Runtime for GenAI (https://github.com/microsoft/onnxruntime-genai). It was converted with the following commands:

1. Install the SDK

```bash
pip install torch transformers onnx onnxruntime
pip install --pre onnxruntime-genai
```

2. Convert the model to ONNX with GPU support

```bash
python3 -m onnxruntime_genai.models.builder -m microsoft/Phi-3.5-mini-instruct -o ./onnx-gpu -p int4 -e cuda -c ./Phi-3.5-mini-instruct
```

Here `-m` is the source Hugging Face model, `-o` is the output folder, `-p int4` selects int4 quantized precision, `-e cuda` targets the CUDA execution provider, and `-c` is the cache directory for the downloaded model.

This is a straightforward conversion; no model-specific optimization has been applied. Please look forward to the official release.
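Once converted, the model can be loaded and run with the `onnxruntime-genai` Python API. The sketch below is a minimal example, not part of this repo: the model directory, `max_length` value, and the Phi-3 style chat template (`<|user|>` / `<|end|>` / `<|assistant|>` tags) are assumptions, and the streaming loop follows the API of recent `onnxruntime-genai` releases, which may differ in older versions:

```python
import os

# Assumed model directory: the output folder from the builder command above.
MODEL_DIR = "./onnx-gpu"

def format_prompt(user_message: str) -> str:
    # Phi-3.5-mini-instruct chat format: a user turn followed by the
    # assistant tag that the model then completes.
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

def generate(prompt: str, max_length: int = 256) -> str:
    # Imported lazily so format_prompt works even without the package installed.
    import onnxruntime_genai as og

    model = og.Model(MODEL_DIR)
    tokenizer = og.Tokenizer(model)
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(prompt))

    # Stream tokens one at a time and decode them incrementally.
    stream = tokenizer.create_stream()
    pieces = []
    while not generator.is_done():
        generator.generate_next_token()
        pieces.append(stream.decode(generator.get_next_tokens()[0]))
    return "".join(pieces)

if __name__ == "__main__":
    prompt = format_prompt("What is ONNX Runtime?")
    if os.path.isdir(MODEL_DIR):  # run generation only if the converted model exists
        print(generate(prompt))
    else:
        print(prompt)
```

Since the model was built with `-e cuda`, generation runs on the CUDA execution provider; a CUDA-capable GPU and the matching onnxruntime-genai CUDA package are required at inference time.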
