NGME-LLaMA 264M

  • Trained on 4× NVIDIA A6000 GPUs for ~4 days
  • Trained on ~4.9 billion tokens (4 * 16 * 768 * 100_000; worked out in the sketch below)
  • Trained on the C4 corpus
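
The token budget follows from the product quoted above. A minimal sketch of that arithmetic, assuming the four factors are GPU count, per-device batch size, sequence length, and optimizer steps (the card lists only the raw numbers):

```python
# Token-budget arithmetic for the figures quoted above. The meaning of
# each factor is an assumption; the card gives only the raw product.
num_gpus = 4        # assumed: the 4 A6000s
batch_size = 16     # assumed: per-GPU batch size
seq_len = 768       # assumed: sequence length in tokens
steps = 100_000     # assumed: optimizer steps

total_tokens = num_gpus * batch_size * seq_len * steps
print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f}B)")
# -> 4,915,200,000 tokens (~4.9B)
```

For local use, a hedged loading sketch. This is an assumption rather than a documented interface: it presumes the checkpoint works with the transformers Auto classes, and since NGME is a custom architecture, `trust_remote_code=True` is likely required.

```python
# Hedged loading sketch (not confirmed by the card): assumes the repo
# works with transformers' Auto classes; NGME is a custom architecture,
# so trust_remote_code=True is likely required.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "PatrickHaller/ngme-llama-264M"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```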
