Jellywibble committed 51c1150 (parent: 4617606)

Update README.md

Files changed (1): README.md (+2 −1)
README.md CHANGED
@@ -67,4 +67,5 @@ The original dataset contains over 50 million rows of completions (chatbot respo
 </figure>
 
 ### Training procedure
-The `gpt2_base_retry_and_continue_5m_reward_model` was trained using a [gpt2](https://huggingface.co/gpt2) base model and a classification head with single output. Binary Cross Entropy loss was used. The model was trained on 4xA40 GPUs, 16 per device batch size and gradient accumulation of 1 (therefore the effective batch size is 64), with 1e-5 learning rate for 2 epochs for a total of 156,240 steps. Tensor parallelism and pipeline parallelism were used to distribute the model across GPUs.
+The `gpt2_base_retry_and_continue_5m_reward_model` was trained using a [gpt2](https://huggingface.co/gpt2) base model with a single-output classification head, optimized with binary cross-entropy loss. Training ran on 4xA40 GPUs with a per-device batch size of 16 and gradient accumulation of 1 (an effective batch size of 64), at a learning rate of 1e-5 for 2 epochs, 156,240 steps in total. Tensor parallelism and pipeline parallelism were used to distribute the model across the GPUs.
+[Weights and Biases Log](https://wandb.ai/jellywibble/reward)
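The training setup described in the diff — a single reward logit scored with binary cross-entropy, and an effective batch size derived from GPUs × per-device batch × gradient-accumulation steps — can be sketched in plain Python. This is a minimal illustration, not the authors' training code; the function name `bce_loss` is hypothetical (in the Transformers library this setup roughly corresponds to `GPT2ForSequenceClassification` with `num_labels=1`).

```python
import math

def bce_loss(logit: float, label: int) -> float:
    """Binary cross-entropy on a single reward logit.

    label 1 = completion accepted by the user, 0 = rejected
    (hypothetical labeling; a sketch of the loss described in the README).
    """
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid turns the logit into P(accepted)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

# Effective batch size from the README's numbers:
# 4 GPUs x per-device batch of 16 x gradient accumulation of 1
effective_batch = 4 * 16 * 1  # = 64
```

For example, a logit of 0.0 gives p = 0.5, so the loss is ln 2 ≈ 0.693 regardless of the label; confident correct predictions drive the loss toward 0.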