Baseline First Run Checkpoint

FSDP checkpoint from training run global_step_264.

Files

This repository contains FSDP sharded checkpoint files:

  • model_world_size_8_rank_*.pt: Model weights sharded across 8 GPUs
  • optim_world_size_8_rank_*.pt: Optimizer states
  • extra_state_world_size_8_rank_*.pt: Extra training state
  • fsdp_config.json: FSDP configuration
  • huggingface/: Tokenizer and config files
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support