Gemma Embeddings v1.0

GemmaEmbed is a dense-vector embedding model trained especially for retrieval. As of December 12, 2024, GemmaEmbed holds the #1 position overall on the MTEB (Massive Text Embedding Benchmark) leaderboard, with a score of 72.72.
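
As a minimal usage sketch (not an official snippet): this assumes the checkpoint loads with the standard transformers AutoModel API and that attention-masked mean pooling is a reasonable default; the actual recommended pooling and any instruction/prompt format may differ.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "google/Gemma-Embeddings-v1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(texts: list[str]) -> torch.Tensor:
    """Encode texts into L2-normalized dense vectors via masked mean pooling
    (an assumed default; the official pooling strategy may differ)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state               # [B, T, H]
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)       # masked mean
    return F.normalize(pooled, p=2, dim=1)

query = embed(["how do dense retrievers work?"])
docs = embed(["Dense retrieval encodes queries and documents as vectors.",
              "Gemma is a family of open models from Google."])
scores = query @ docs.T  # cosine similarity, since vectors are unit-normalized
print(scores)
```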

Important Notes

  • This is not an official Google product.
  • This is a research project.

Results Summary

Results comparing Gemma-Embeddings-v1.0 with BGE-EN-ICL and NV-Embed-v2 on each MTEB task category (number of tasks per category in parentheses):

| Model | Total (56) | Classification (12) | Pair Classification (3) | STS (10) | Clustering (11) | Reranking (4) | Retrieval (15) | Summarization (1) |
|---|---|---|---|---|---|---|---|---|
| bge-en-icl | 0.7167 | 0.8895 | 0.8814 | 0.8425 | 0.5789 | 0.5986 | 0.6216 | 0.3077 |
| NV-Embed-v2 | 0.7231 | 0.9037 | 0.8867 | 0.8431 | 0.5846 | 0.6065 | 0.6265 | 0.3070 |
| Gemma-Embeddings-v1.0 | 0.7272 | 0.9000 | 0.8809 | 0.8423 | 0.5826 | 0.6214 | 0.6371 | 0.4052 |
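
Scores like these can be reproduced with the mteb package; the sketch below assumes the checkpoint is loadable as a SentenceTransformer (any object exposing an encode() method also works) and picks one illustrative task per category rather than the full 56-task benchmark.

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Assumption: the checkpoint is SentenceTransformer-compatible.
model = SentenceTransformer("google/Gemma-Embeddings-v1.0")

# One task per MTEB category from the table above, as a smoke test;
# the full English benchmark covers all 56 tasks.
evaluation = MTEB(tasks=["Banking77Classification", "SprintDuplicateQuestions",
                         "STS12", "TwentyNewsgroupsClustering",
                         "SciDocsRR", "SciFact", "SummEval"])
evaluation.run(model, output_folder="results/gemma-embeddings-v1.0")
```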

Model & Data

Our base encoder model is Gemma2 9B.

We use the BGE-EN-ICL training data.
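
The training objective is not specified here; dense retrievers fine-tuned on BGE-style (query, positive passage) pairs commonly use an InfoNCE contrastive loss with in-batch negatives. The sketch below illustrates that standard recipe, not this model's confirmed setup.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(q: torch.Tensor, p: torch.Tensor, temperature: float = 0.05):
    """q, p: [B, H] query/passage embeddings. Row i of p is the positive for
    row i of q; every other row in the batch serves as a negative."""
    q = F.normalize(q, dim=1)
    p = F.normalize(p, dim=1)
    logits = (q @ p.T) / temperature    # [B, B] similarity matrix
    targets = torch.arange(q.size(0))   # diagonal entries are the positives
    return F.cross_entropy(logits, targets)

# Example with random stand-in embeddings: batch size 4, dimension 8
loss = info_nce_loss(torch.randn(4, 8), torch.randn(4, 8))
print(loss.item())
```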

Research Team

  • Nicholas Monath
  • Michael Boratko
  • Seungyeon Kim
  • Andrew McCallum
  • Rob Fergus
  • Manzil Zaheer