Alireo-400M 🤖 🇮🇹

A Lightweight Italian Language Model

Model Description 📝

Alireo-400M is a lightweight yet powerful Italian language model with 400M parameters, designed to provide efficient natural language processing capabilities while maintaining a smaller footprint compared to larger models.

Key Features ✨

  • Architecture: Transformer-based language model 🏗️
  • Parameters: 400M 📊
  • Context Window: 8K tokens 🪟
  • Training Data: Curated Italian text corpus (books, articles, web content) 📚
  • Model Size: ~800MB 💾
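
The ~800MB figure follows directly from the parameter count and the storage dtype: 400M parameters at 2 bytes each (BF16) come to roughly 800MB of weights, before tokenizer files and framework overhead. A minimal back-of-the-envelope sketch (the helper function is illustrative, not part of the model's tooling):

```python
# Approximate weight footprint: parameter count × bytes per element.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4, "int8": 1}

def weight_footprint_mb(n_params: int, dtype: str = "bf16") -> float:
    """Rough on-disk / in-RAM size of the weights in megabytes."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e6

print(weight_footprint_mb(400_000_000, "bf16"))  # → 800.0
```

The same arithmetic explains why 2GB of RAM is a workable minimum: weights plus activations and runtime overhead fit comfortably, with headroom at 4GB.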

Performance 📈

Despite its compact size, Alireo-400M demonstrates impressive performance:

  • Benchmark Results: Outperforms Qwen 0.5B across multiple benchmarks 🏆
  • Language Understanding: Maintains high accuracy on Italian language understanding tasks 🎯
  • Speed: Efficient inference speed due to optimized architecture ⚡

Limitations ⚠️

  • Limited context window compared to larger models
  • May struggle with highly specialized technical content
  • Performance may vary across regional dialects
  • Not suitable for multilingual tasks

Hardware Requirements 💻

  • Minimum RAM: 2GB
  • Recommended RAM: 4GB
  • GPU: Optional, but recommended for faster inference
  • Disk Space: ~1GB (including model and dependencies)
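
For reference, a typical loading and generation sketch with the Hugging Face transformers library is shown below. The hub id `DeepMount00/Alireo-400m-instruct-v0.1` is taken from this page and refers to the instruct variant; adjust it if you are using a different checkpoint. BF16 matches the published tensor type; fall back to FP32 on hardware without BF16 support.

```python
# Hypothetical usage sketch, assuming the hub id below (the instruct
# variant listed on this page); not an official quickstart.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DeepMount00/Alireo-400m-instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
)
model.eval()

prompt = "Qual è la capitale d'Italia?"  # "What is the capital of Italy?"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

GPU use is optional, as noted above; on CPU the 400M parameter size keeps latency manageable for short generations.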

Citation 📄

@software{alireo2024,
  author = {Michele Montebovi},
  title = {Alireo-400M: A Lightweight Italian Language Model},
  year = {2024},
}