# Model Summary

This is a fork of the original GritLM model. The only difference from the original is the architecture name in the config file, which was changed to make the model easier to adopt in vLLM.

GritLM is a generative representational instruction-tuned language model. It unifies text representation (embedding) and text generation in a single model, achieving state-of-the-art performance on both types of tasks.

## Model Description

| Model | Base model | Training |
|---|---|---|
| GritLM 7B | Mistral 7B | finetuned using GRIT |
| GritLM 8x7B | Mixtral 8x7B | finetuned using GRIT |

## Use

The model usage is documented here.
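Because GritLM serves both embedding and generation from one checkpoint, the input is wrapped in a mode-specific prompt template. The helpers below are a minimal sketch of that template as described in the GritLM paper and repository (the exact tags should be verified against the upstream usage documentation); the instruction strings in the example are illustrative:

```python
def gritlm_instruction(instruction: str) -> str:
    """Wrap an embedding instruction in GritLM's prompt template.

    An empty instruction falls back to the bare embed tag.
    """
    if instruction:
        return "<|user|>\n" + instruction + "\n<|embed|>\n"
    return "<|embed|>\n"


def generation_prompt(user_message: str) -> str:
    """Wrap a chat message in GritLM's generation template."""
    return "<|user|>\n" + user_message + "\n<|assistant|>\n"


# Example: queries carry a task instruction; documents are
# typically embedded without one.
query_prefix = gritlm_instruction(
    "Given a scientific paper title, retrieve the paper's abstract"
)
document_prefix = gritlm_instruction("")
```

With these prefixes, texts can be encoded by any server that exposes this model's embedding head (e.g. a vLLM deployment), while `generation_prompt` formats inputs for ordinary text generation.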

## Citation

```bibtex
@misc{muennighoff2024generative,
      title={Generative Representational Instruction Tuning},
      author={Niklas Muennighoff and Hongjin Su and Liang Wang and Nan Yang and Furu Wei and Tao Yu and Amanpreet Singh and Douwe Kiela},
      year={2024},
      eprint={2402.09906},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
Model size: 7.24B parameters (BF16, Safetensors)