This is a work-in-progress attempt to use a Phi-3 model with Unity Sentis.

It is the microsoft/Phi-3-mini-4k-instruct model, exported to ONNX via Hugging Face Optimum:

```python
from optimum.onnxruntime import ORTModelForCausalLM

model_id = "microsoft/Phi-3-mini-4k-instruct"
# Export without KV cache or IO binding so the graph is self-contained.
model = ORTModelForCausalLM.from_pretrained(
    model_id,
    use_cache=False,
    use_io_binding=False,
    export=True,
    trust_remote_code=True,
    cache_dir=".",
)
model.save_pretrained("phi3_onnx_no_cache/")
```

The ONNX model was then quantized to a Uint8 .sentis model file using Sentis v1.6.0-pre.1.
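Uint8 quantization stores each weight tensor as 8-bit integers plus a float scale and zero point, cutting weight storage to roughly a quarter of FP32. A minimal affine-quantization sketch in numpy (illustrative only; this is not Sentis's internal scheme, and the function names are hypothetical):

```python
import numpy as np

def quantize_uint8(weights: np.ndarray):
    """Affine uint8 quantization: w ≈ scale * (q - zero_point)."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 or 1.0  # avoid zero scale for constant tensors
    zero_point = round(-w_min / scale)
    q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize_uint8(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_uint8(w)
w_hat = dequantize_uint8(q, scale, zp)
print(np.abs(w - w_hat).max())  # per-weight error bounded by ~scale/2
```

The reconstruction error is bounded by about half the quantization step, which is why uint8 weight quantization usually costs little model quality relative to the 4x size reduction.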

Inference from Unity will be possible through the com.doji.transformers library, but its LLM support has not been officially released yet.
