MMTEB Collection Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 2
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published 4 days ago • 26
CommonCrawl Collection Large web-mined general corpus based on CommonCrawl. • 7 items • Updated Dec 8, 2024 • 2
NoLiMa: Long-Context Evaluation Beyond Literal Matching Paper • 2502.05167 • Published 16 days ago • 15
mistralai/Mistral-Small-24B-Instruct-2501 Text Generation • Updated 21 days ago • 736k • 809