nvidia/mamba2-8b-3t-4k
Text Generation
A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers.