Converted from TinyLlama/TinyLlama-1.1B-Chat-v1.0 and quantized to 4 bits.

Requires onnxruntime>=1.17.0
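
The version requirement above can be checked before loading the model. The following is a minimal sketch, assuming the model is run from Python with the `onnxruntime` package installed from PyPI (other builds, such as `onnxruntime-gpu`, are distributed under different names and are not detected here). The helper names `parse_version`, `meets_minimum`, and `onnxruntime_meets` are hypothetical and not part of any library.

```python
import re
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple of integers.

    Only the leading digits of each component are used, so a
    pre-release suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so '1.17' compares equal to '1.17.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def onnxruntime_meets(minimum: str) -> bool:
    """Check the installed `onnxruntime` distribution against `minimum`.

    Returns False if the package is not installed at all.
    """
    try:
        installed = metadata.version("onnxruntime")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)
```

Calling `onnxruntime_meets(minimum)` with the minimum version string stated above returns `False` when the package is missing or too old, which is a reasonable point to stop and upgrade before attempting to load the quantized weights.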
