Very early test model (untested at the time of writing).
Base model: https://huggingface.co/arcee-ai/AFM-4.5B-Preview
Fine-tuned on 5k entries of mixed SFT data at a learning rate of 2e-6, using 4-bit QLoRA with rank/alpha 32 and an effective batch size of 32 (batch size 8, gradient accumulation 4), for a total of 2 epochs with a cosine learning-rate schedule.
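As a rough sanity check, the hyperparameters above imply the step counts sketched below. This is a minimal illustration, assuming the trainer rounds the final partial batch up and applies cosine decay with no warmup; all names are illustrative, not the actual training script.

```python
import math

# Hyperparameters as stated in the card.
NUM_EXAMPLES = 5_000        # mixed SFT entries
PER_DEVICE_BS = 8           # batch size
GRAD_ACCUM = 4              # gradient accumulation steps
EFFECTIVE_BS = PER_DEVICE_BS * GRAD_ACCUM  # ebs 32
EPOCHS = 2
PEAK_LR = 2e-6

# Optimizer steps, assuming the final partial batch still counts as a step.
steps_per_epoch = math.ceil(NUM_EXAMPLES / EFFECTIVE_BS)
total_steps = steps_per_epoch * EPOCHS

def cosine_lr(step: int, total: int, peak: float) -> float:
    """Cosine decay from the peak LR down to 0 over the run (no warmup assumed)."""
    return peak * 0.5 * (1.0 + math.cos(math.pi * step / total))

print(EFFECTIVE_BS)   # 32
print(total_steps)    # 314
print(cosine_lr(0, total_steps, PEAK_LR))            # 2e-06 at the start
print(cosine_lr(total_steps, total_steps, PEAK_LR))  # decays to ~0 at the end
```

With these numbers the run is quite short (on the order of 314 optimizer steps), which is consistent with the card describing this as a very early test model.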