Add library name and link to code
This PR adds a link to the GitHub repository to improve discoverability. It also declares the library used (Transformers) via the `library_name` metadata field, so that it appears in the top right corner of the model page.
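The `library_name` field lives in the card's YAML front matter and is machine-readable. A quick sketch of checking it with `huggingface_hub` (the repo id is an assumption inferred from the model name, not stated in this PR):

```python
# Sketch: read the card metadata this PR touches via huggingface_hub.
# The repo id below is an assumption inferred from the model name.
from huggingface_hub import ModelCard

card = ModelCard.load("general-preference/SPPO-Llama-3-8B-Instruct-GPM-2B")
print(card.data.library_name)  # expected: "transformers" once this PR is merged
print(card.data.pipeline_tag)  # "text-generation"
```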
README.md CHANGED

````diff
@@ -1,10 +1,11 @@
 ---
+datasets:
+- openbmb/UltraFeedback
 language:
 - en
 license: apache-2.0
-datasets:
-- openbmb/UltraFeedback
 pipeline_tag: text-generation
+library_name: transformers
 model-index:
 - name: SPPO-Llama-3-8B-Instruct-GPM-2B
   results:
@@ -104,6 +105,8 @@ model-index:
 
 General Preference Modeling with Preference Representations for Aligning Language Models (https://arxiv.org/abs/2410.02197)
 
+This code can be found at https://github.com/general-preference/general-preference-model
+
 # SPPO-Llama-3-8B-Instruct-GPM-2B
 
 This model was developed using [SPPO](https://arxiv.org/abs/2405.00675) at iteration 3 and the [General Preference representation Model (GPM)](https://arxiv.org/abs/2410.02197) (specifically, using [GPM-Gemma-2B](https://huggingface.co/general-preference/GPM-Gemma-2B)), based on the [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) architecture as starting point. We utilized the prompt sets from the [openbmb/UltraFeedback](https://huggingface.co/datasets/openbmb/UltraFeedback) dataset, splited to 3 parts for 3 iterations by [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset). All responses used are synthetic.
@@ -165,5 +168,4 @@ The following hyperparameters were used during training:
 journal={arXiv preprint arXiv:2410.02197},
 year={2024}
 }
-```
-
+```
````
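With `library_name: transformers` declared, the Hub will surface the standard Transformers loading snippet for this model. A minimal sketch of that usage (the repo id is assumed from the model name; the prompt and generation settings are illustrative, not taken from the card):

```python
# Minimal sketch: load the model through the Transformers API implied by
# `library_name: transformers`. The repo id is an assumption; generation
# settings are illustrative defaults, not values from this PR.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "general-preference/SPPO-Llama-3-8B-Instruct-GPM-2B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Llama-3-Instruct models use a chat template; apply_chat_template builds
# the prompt in the format the model was fine-tuned on.
messages = [{"role": "user", "content": "Briefly explain what SPPO is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```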