
Weights for the softmax model from the paper *2Mamba2Furious: Linear in Complexity, Competitive in Accuracy*. This variant uses traditional Mamba and was used for the NIAH (needle-in-a-haystack) experiment. It was trained for 400K steps with a batch size of 32. More details of the setup can be found in the GitHub repo.

Instructions on how to use this model can be found at https://github.com/gmongaras/2Mamba2Furious.

Model size: 0.7B parameters · Tensor type: F32 (Safetensors)
