sentence-transformers/paraphrase-multilingual-mpnet-base-v2

stevens11951

Aug 23, 2023

Can you elaborate on the model's language capabilities in the model card?

piyushsinghpasi

Jan 19

edited Jan 19

Hi @stevens11951, the Sentence Transformers documentation lists the languages: https://www.sbert.net/docs/sentence_transformer/pretrained_models.html#multilingual-models

Listing them here for ease of access for other folks:
lang = [
"ar", "bg", "ca", "cs", "da", "de", "el", "es", "en", "et", "fa", "fi", "fr", "fr-ca",
"gl", "gu", "he", "hi", "hr", "hu", "hy", "id", "it", "ja", "ka", "ko", "ku",
"lt", "lv", "mk", "mn", "mr", "ms", "my", "nb", "nl", "pl", "pt", "pt-br", "ro",
"ru", "sk", "sl", "sq", "sr", "sv", "th", "tr", "uk", "ur", "vi", "zh-cn", "zh-tw",
]
Edit: added en 😅
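
The list above has more entries than the "50 languages" figure suggests, because it includes regional variants such as fr-ca, pt-br, zh-cn and zh-tw. A minimal sketch of how one might check coverage, assuming that regional variants should be collapsed to their base language (the helper names `base_language` and `is_listed` are hypothetical and not part of sentence-transformers):

```python
# Language codes copied from the list above.
lang = [
    "ar", "bg", "ca", "cs", "da", "de", "el", "es", "en", "et", "fa", "fi", "fr", "fr-ca",
    "gl", "gu", "he", "hi", "hr", "hu", "hy", "id", "it", "ja", "ka", "ko", "ku",
    "lt", "lv", "mk", "mn", "mr", "ms", "my", "nb", "nl", "pl", "pt", "pt-br", "ro",
    "ru", "sk", "sl", "sq", "sr", "sv", "th", "tr", "uk", "ur", "vi", "zh-cn", "zh-tw",
]


def base_language(code: str) -> str:
    """Return the primary language subtag, e.g. 'zh-cn' -> 'zh' (hypothetical helper)."""
    return code.split("-")[0].lower()


def is_listed(code: str) -> bool:
    """True if the code, or any regional variant of its base language, is in the list."""
    wanted = base_language(code)
    return any(base_language(c) == wanted for c in lang)


# Collapse regional variants to count distinct base languages.
base_languages = sorted({base_language(c) for c in lang})

print(len(lang), len(base_languages))  # 53 50
print(is_listed("zh"), is_listed("zh-CN"), is_listed("sw"))  # True True False
```

Under that assumption, the 53 entries collapse to 50 base languages, and Chinese is covered through zh-cn and zh-tw even though a plain "zh" entry does not appear in the list.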

patlilt

29 days ago

@piyushsinghpasi Could the model card be updated? It currently doesn't list some of the languages you mention (which also appear in the sentence-transformers documentation). Confusingly, it also refers to xlm-roberta, although (I assume) the model is actually based on MPNet (https://huggingface.co/microsoft/mpnet-base).

piyushsinghpasi

23 days ago

Hi @patlilt, I just checked: it's already in the README, and when you click the "50 languages" tag just below the model ID, it expands to show all languages.

patlilt

23 days ago

@piyushsinghpasi Thanks! But that list doesn't include Chinese, for example.

Model card: Language?