shivash / testingmodel / enhanced_hybrid_transformer

Enhanced Hybrid Transformer 416M

🚀 A 416,417,792-parameter (416M) transformer with the architectural optimizations listed below.

Features

24 layers × 16 heads
GQA-4 (Grouped Query Attention)
SwiGLU activation
RMSNorm normalization
RoPE positional embeddings
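
As a rough illustration of three of the features above, the following pure-Python sketch shows how query heads map onto shared key/value heads under grouped-query attention, how RMSNorm rescales a vector, and how the SwiGLU gate combines two projections. It is not the repository's implementation: "GQA-4" is read here as 4 key/value heads (with 16 query heads this gives groups of 4 either way), the `eps` default is an assumed value, the function names are hypothetical, and the linear projections around the SwiGLU gate are omitted.

```python
import math

NUM_QUERY_HEADS = 16  # from the feature list
NUM_KV_HEADS = 4      # assumption: "GQA-4" read as 4 key/value heads


def kv_head_for(query_head: int) -> int:
    """Index of the key/value head shared by the given query head."""
    group_size = NUM_QUERY_HEADS // NUM_KV_HEADS
    return query_head // group_size


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned scale."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(gate, up):
    """SwiGLU gating: SiLU(gate) multiplied elementwise with up."""
    return [silu(g) * u for g, u in zip(gate, up)]
```

Under these assumptions, query heads 0 to 3 share key/value head 0, heads 4 to 7 share head 1, and so on, which shrinks the key/value cache to a quarter of what full multi-head attention would need.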

Contents

pytorch_model.bin - Model weights
config.json - Model configuration
tokenizer.json - Tokenizer file
README.md - This file

Usage

Load the weights (pytorch_model.bin) together with config.json using the model code from the original repository for full functionality.
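
Before wiring the files into the original model code, it can help to inspect them directly. The sketch below is a minimal example under stated assumptions: the files sit in a local directory, `pytorch_model.bin` holds a plain PyTorch state dict, and the helper names (`read_config`, `count_parameters`, `load_state_dict`) are hypothetical rather than part of the repository.

```python
import json
from pathlib import Path


def read_config(repo_dir):
    """Parse config.json from a local copy of the repository."""
    path = Path(repo_dir) / "config.json"
    return json.loads(path.read_text(encoding="utf-8"))


def count_parameters(state_dict):
    """Sum the element counts of every tensor in a state dict."""
    return sum(t.numel() for t in state_dict.values())


def load_state_dict(repo_dir):
    """Load pytorch_model.bin onto the CPU (requires PyTorch)."""
    import torch  # imported lazily so the helpers above work without it

    # Assumption: the file is a plain state dict of tensors.
    return torch.load(
        Path(repo_dir) / "pytorch_model.bin",
        map_location="cpu",
        weights_only=True,
    )
```

Calling `count_parameters(load_state_dict(repo_dir))` gives a total to compare against the 416,417,792 figure above; the two may differ if the checkpoint stores tied weights or non-parameter buffers.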

🚀 Generated with Claude Code