This repository is a fork of philschmid/all-MiniLM-L6-v2-optimum-embeddings. My own ONNX conversion runs about 4x slower, with no discernible cause: the quantized models appear roughly identical. Forking also lets us, e.g., upgrade the Optimum library used.
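
To investigate the 4x slowdown, a minimal latency comparison along these lines could help. This is a sketch, not a definitive benchmark: the fork's model ID below is a placeholder, and depending on the repo layout you may need to pass `file_name` to select the quantized ONNX file.

```python
import time
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForFeatureExtraction

UPSTREAM = "philschmid/all-MiniLM-L6-v2-optimum-embeddings"
FORK = "your-username/this-fork"  # hypothetical: replace with this repo's ID

def benchmark(model_id: str, n_runs: int = 100) -> float:
    """Return the mean per-inference latency in seconds for one model repo."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # If the repo ships the quantized graph under a non-default name, add e.g.
    # file_name="model_quantized.onnx" to the call below.
    model = ORTModelForFeatureExtraction.from_pretrained(model_id)
    inputs = tokenizer("A sentence to embed.", return_tensors="pt")
    model(**inputs)  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(n_runs):
        model(**inputs)
    return (time.perf_counter() - start) / n_runs

for model_id in (UPSTREAM, FORK):
    print(f"{model_id}: {benchmark(model_id) * 1000:.2f} ms/inference")
```

If the timings diverge while the ONNX files look identical, the difference likely lies in the conversion or session configuration rather than the weights.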
