AI & ML interests
Below you will find models tuned for sentence and text embedding generation. They can be used with the sentence-transformers package.
Organization Card
SentenceTransformers 🤗 is a Python framework for state-of-the-art sentence, text and image embeddings.
Install the Sentence Transformers library.
pip install -U sentence-transformers
The usage is as simple as:
from sentence_transformers import SentenceTransformer
# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")
# The sentences to encode
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium.",
]
# 2. Calculate embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
# [0.6660, 1.0000, 0.1411],
# [0.1046, 0.1411, 1.0000]])
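To make the similarity scores above more concrete, here is a minimal sketch of cosine similarity in plain Python. It assumes cosine similarity is the measure in use, which is the library default as far as we know but depends on the model's configuration. The helper names and the toy 2-dimensional vectors are made up for illustration; this is not the library's implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(vectors):
    """Pairwise cosine similarities, analogous to comparing a batch with itself."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]

# Toy stand-ins for embeddings (real ones have hundreds of dimensions).
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
matrix = similarity_matrix(toy_embeddings)

for row in matrix:
    print([round(value, 4) for value in row])
```

The diagonal is 1 because every vector is perfectly aligned with itself, and the matrix is symmetric, which matches the shape of the tensor printed above.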
Hugging Face makes it easy to collaboratively build and showcase your Sentence Transformers models! You can collaborate with your organization, and upload and showcase your own models on your profile ❤️

Documentation

Push your Sentence Transformers models to the Hub ❤️

Find all Sentence Transformers models on the 🤗 Hub
To upload your Sentence Transformers models to the Hugging Face Hub, log in with huggingface-cli login and use the push_to_hub method within the Sentence Transformers library.
from sentence_transformers import SentenceTransformer
# Load or train a model
model = SentenceTransformer(...)
# Push to Hub
model.push_to_hub("my_new_model")
A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers
These datasets all have "english" and "non_english" columns. They can be used to make embedding models multilingual.
- sentence-transformers/parallel-sentences-wikititles • 14.7M • 241 • 1
- sentence-transformers/parallel-sentences-tatoeba • 8.35M • 1.48k
- sentence-transformers/parallel-sentences-talks • 19.6M • 1.68k • 12
- sentence-transformers/parallel-sentences-europarl • 49.7M • 584 • 1
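As a rough illustration of the column layout described above, the sketch below turns rows with "english" and "non_english" fields into sentence pairs. The rows are invented toy data, not taken from the datasets, and `to_pairs` is a hypothetical helper rather than part of Sentence Transformers.

```python
# Toy rows mimicking the "english" / "non_english" column layout.
# These sentences are made up for illustration only.
rows = [
    {"english": "Good morning.", "non_english": "Guten Morgen."},
    {"english": "Thank you very much.", "non_english": "Vielen Dank."},
    {"english": "", "non_english": "Leerer Eintrag."},
]

def to_pairs(rows):
    """Return (english, non_english) tuples, skipping rows with an empty side."""
    pairs = []
    for row in rows:
        english = row["english"].strip()
        other = row["non_english"].strip()
        if english and other:
            pairs.append((english, other))
    return pairs

pairs = to_pairs(rows)
print(len(pairs))
for english, other in pairs:
    print(english, "<->", other)
```

Pairs of this shape are the usual input when teaching a model that a sentence and its translation should receive similar embeddings.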
Models (126)
- sentence-transformers/paraphrase-multilingual-mpnet-base-v2 • Sentence Similarity • 0.3B • 2.9M • 404
- sentence-transformers/stsb-mpnet-base-v2 • Sentence Similarity • 0.1B • 10.6k • 12
- sentence-transformers/paraphrase-mpnet-base-v2 • Sentence Similarity • 0.1B • 719k • 43
- sentence-transformers/nli-mpnet-base-v2 • Sentence Similarity • 0.1B • 41.3k • 14
- sentence-transformers/multi-qa-mpnet-base-dot-v1 • Sentence Similarity • 0.1B • 1.56M • 177
- sentence-transformers/multi-qa-mpnet-base-cos-v1 • Sentence Similarity • 0.1B • 604k • 41
- sentence-transformers/all-mpnet-base-v1 • Sentence Similarity • 0.1B • 3.28k • 11
- sentence-transformers/all-mpnet-base-v2 • Sentence Similarity • 0.1B • 19.6M • 1.13k
- sentence-transformers/average_word_embeddings_levy_dependency • Sentence Similarity
- sentence-transformers/average_word_embeddings_komninos • Sentence Similarity • 4
Datasets (90)
- sentence-transformers/msmarco-scores-ms-marco-MiniLM-L6-v2 • 241M • 114 • 2
- sentence-transformers/msmarco • 527M • 717 • 5
- sentence-transformers/msmarco-msmarco-MiniLM-L6-v3 • 80.6M • 619 • 3
- sentence-transformers/NanoTouche2020-bm25 • 5.84k • 36
- sentence-transformers/NanoSciFact-bm25 • 3.02k • 43
- sentence-transformers/NanoArguAna-bm25 • 3.74k • 45
- sentence-transformers/NanoSCIDOCS-bm25 • 2.31k • 44
- sentence-transformers/NanoQuoraRetrieval-bm25 • 5.15k • 54
- sentence-transformers/NanoNQ-bm25 • 5.14k • 482
- sentence-transformers/NanoNFCorpus-bm25 • 3.05k • 458