Update README.md
README.md
CHANGED
@@ -1625,6 +1625,8 @@ language:
<img src="https://huggingface.co/zeroshot/bge-small-en-v1.5-quant/resolve/main/latency.png" alt="latency" width="600" style="display:inline-block; margin-right:10px;"/>
</div>

+DeepSparse improves latency by 3X on a 10-core laptop and by 5X on an AWS instance.
+
## Usage

This is the quantized (INT8) ONNX variant of the [bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) embeddings model, produced with [Sparsify](https://github.com/neuralmagic/sparsify) for quantization and run with [DeepSparseSentenceTransformers](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/sentence_transformers) for inference.
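
As a rough usage sketch (not part of the committed card), loading the quantized model through the DeepSparse sentence-transformers integration could look like the snippet below. It assumes `DeepSparseSentenceTransformer` exposes the familiar sentence-transformers `encode()` interface, that the quantized weights live in the `zeroshot/bge-small-en-v1.5-quant` repository referenced by the latency image above, and that `export=False` indicates the repository already ships an exported ONNX model.

```python
# Hedged usage sketch: class name, encode() API, and the export=False argument
# follow the DeepSparse sentence-transformers integration as commonly documented;
# adjust the model ID if your deployment uses a different repository.
from deepsparse.sentence_transformers import DeepSparseSentenceTransformer

# export=False assumes the repo already contains an exported ONNX model.
model = DeepSparseSentenceTransformer("zeroshot/bge-small-en-v1.5-quant", export=False)

sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "Quantized embedding models can run efficiently on CPUs.",
]

# encode() returns one dense embedding per input sentence.
embeddings = model.encode(sentences)

for sentence, embedding in zip(sentences, embeddings):
    print(f"{sentence!r} -> embedding of shape {embedding.shape}")
```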