This repository is a fork of philschmid/all-MiniLM-L6-v2-optimum-embeddings. My own ONNX conversion seems to be about 4x slower, with no discernible reason why: the quantized models look roughly the same. The idea behind forking is that it also lets us, e.g., upgrade the Optimum library used.
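
For reference, below is a minimal sketch of how the base model could be re-exported and quantized with a recent Optimum release. The model id, save directory, quantization config, and benchmark loop are illustrative assumptions, not the exact steps used to produce this repo's artifacts.

```python
# Sketch only: re-export + dynamically quantize all-MiniLM-L6-v2 with Optimum,
# then time the quantized model. Paths/configs are assumptions for illustration.
import time

from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig
from transformers import AutoTokenizer

model_id = "sentence-transformers/all-MiniLM-L6-v2"
save_dir = "onnx-quantized"  # assumed output directory

# Export the PyTorch checkpoint to ONNX.
model = ORTModelForFeatureExtraction.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Dynamic INT8 quantization; pick the config that matches your CPU's ISA.
quantizer = ORTQuantizer.from_pretrained(model)
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir=save_dir, quantization_config=qconfig)
tokenizer.save_pretrained(save_dir)

# Rough latency probe, usable to compare conversions (e.g. against upstream).
q_model = ORTModelForFeatureExtraction.from_pretrained(
    save_dir, file_name="model_quantized.onnx"
)
inputs = tokenizer("A quick latency probe.", return_tensors="pt")
start = time.perf_counter()
for _ in range(100):
    q_model(**inputs)
print(f"avg latency: {(time.perf_counter() - start) / 100 * 1e3:.2f} ms")
```

Running the same timing loop against both this output and the upstream fork's quantized model should make the reported ~4x gap straightforward to reproduce or rule out.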