Gemma Embeddings v0.8

GemmaEmbed is a dense-vector embedding model trained specifically for retrieval. As of December 2, 2024, GemmaEmbed holds the #1 position overall on the MTEB Retrieval leaderboard, with a score of 63.80.
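In dense retrieval, queries and documents are embedded into the same vector space and documents are ranked by vector similarity to the query. The sketch below illustrates this ranking step with cosine similarity over small hypothetical vectors standing in for model output (the vectors and dimensions are illustrative, not produced by GemmaEmbed):

```python
import numpy as np

def cosine_scores(query_emb, doc_embs):
    # Cosine similarity between one query vector and a matrix of document vectors.
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    return d @ q

# Hypothetical 4-dimensional embeddings standing in for real model output.
query_emb = np.array([0.1, 0.9, 0.2, 0.0])
doc_embs = np.array([
    [0.1, 0.8, 0.3, 0.1],   # on-topic document
    [0.9, 0.1, 0.0, 0.2],   # off-topic document
])

scores = cosine_scores(query_emb, doc_embs)
ranking = np.argsort(-scores)  # indices of documents, best match first
```

At retrieval time the document embeddings are precomputed and indexed, so only the query needs to be embedded per request.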

Important Notes

  • This is not an official Google product.
  • This is a research project.

Results summary

Results compared to BGE-EN-ICL on several large datasets

Model                   DBPedia   FEVER   HotPotQA   MSMARCO   NQ
BGE-EN-ICL                51.63   92.83      85.14     46.79   73.88
Gemma-Embeddings-v0.8     52.58   93.50      87.58     47.13   74.45

Model & Data

Our base encoder model is Gemma2 9B.

We use the BGE-EN-ICL training data.
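The card does not specify how the encoder's token representations are pooled into a single embedding vector. A common choice for encoder-based embedders is masked mean pooling, which averages token states while ignoring padding; a minimal NumPy sketch (with toy hidden states, not real model output):

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    # hidden_states: (seq_len, dim) token representations from the encoder.
    # attention_mask: (seq_len,) with 1 for real tokens, 0 for padding.
    mask = attention_mask[:, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=0)
    count = mask.sum()
    return summed / np.maximum(count, 1e-9)  # avoid division by zero

# Toy example: 3 real tokens plus 1 padding token, 2-dim hidden states.
h = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [100.0, 100.0]])
m = np.array([1, 1, 1, 0])
emb = mean_pool(h, m)  # → [3.0, 4.0]; the padding row is excluded
```

The resulting vector is typically L2-normalized before cosine-similarity retrieval.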

Research Team

  • Nicholas Monath
  • Michael Boratko
  • Seungyeon Kim
  • Andrew McCallum
  • Rob Fergus
  • Manzil Zaheer