In a Training Loop 🔄
lewtun
AI & ML interests
LLMs, LLMs, LLMs
lewtun/Qwen2.5-0.5B-SFT-LoRA
lewtun/Llama-3.1-8B-SFT-LoRA-packing-no-lm-head
lewtun/Llama-3.1-8B-SFT-LoRA-no-packing
lewtun/Llama-3.1-8B-SFT-QLoRA-packing
lewtun/Llama-3.1-8B-SFT-LoRA-packing-no-saved-modules
lewtun/Llama-3.1-8B-SFT-LoRA-packing
lewtun/Llama-3.1-8B-SFT-LoRA-packing-pad-token-eos
lewtun/Llama-3.1-8B-SFT-QLoRA-packing-pad-token-eos
lewtun/Llama-3.1-8B-SFT-full-packing
Text Generation • 8B • 15
lewtun/Llama-3.1-8B-SFT-LoRA
Text Classification • 0.5B • 15
lewtun/gemma-2-2b-it-gkd-9b
lewtun/gemma-2-2b-it-gkd-27b
Text Generation • 1.03M • 6
lewtun/sft_openassistant-guanaco
Text Classification • 0.5B • 12
lewtun/pythia-6.9b-deduped-tldr-online-dpo
7B • 7
lewtun/qwen2-1.5B-ultrafeedback-online-dpo
2B • 6
lewtun/qwen2-0.5B-ultrafeedback-online-dpo
0.6B • 7
lewtun/pythia-2.8b-deduped-tldr-online-dpo
3B • 6
lewtun/qwen2-7B-ultrafeedback-online-dpo-bs-1
lewtun/qwen2-7B-ultrafeedback-online-dpo-bs-2
lewtun/qwen2-7B-ultrafeedback-online-dpo
lewtun/pythia-1b-deduped-tldr-online-dpo
1B • 7
lewtun/pythia-1b-tldr-online-dpo
lewtun/qwen2-0.5B-lr-5e-7