Aleph-Alpha (Aleph Alpha)

Organization Card

Aleph Alpha is dedicated to building sovereign and trustworthy AI systems. Our research has produced state-of-the-art multimodal models (MAGMA), explainability techniques for transformer-based models (AtMan), and a comprehensive evaluation framework for large-scale model assessment. We have also researched how to move beyond traditional tokenizers: our work on tokenizer-free architectures uses byte-level trigrams to create models that are more resilient and adaptable in non-English languages and new domains. Key models demonstrating the effectiveness of our innovative Hierarchical Autoregressive Transformer (HAT) architecture include:

llama-3_1-tfree-hat models: This model family replaces the Llama 3.1 tokenizer with our HAT architecture. The 8b-dpo model is tuned for helpfulness and reduced refusal in sensitive applications, while the larger 70b-sft model is trained on English and German for improved text compression and adaptability.
TFree-HAT-Pretrained-7B-Base: This 7B model was pretrained from scratch on English and German and has a context length of 32,900 words. It shows strong proficiency in German and outperforms Llama 3.1 on many English benchmarks.
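
To make the term "byte-level trigrams" used above concrete, here is a minimal illustrative sketch of how a word can be split into overlapping three-byte windows. It is not Aleph Alpha's implementation: the function name `byte_trigrams` and the handling of words shorter than three bytes are assumptions made purely for illustration.

```python
def byte_trigrams(word: str) -> list[bytes]:
    """Return the overlapping 3-byte windows of a UTF-8 encoded word.

    Illustrative only: the real models may pad, hash, or group bytes
    differently. This just shows what a byte-level trigram is.
    """
    data = word.encode("utf-8")
    # Assumption: words shorter than three bytes yield one shorter window.
    if len(data) < 3:
        return [data]
    return [data[i:i + 3] for i in range(len(data) - 2)]


if __name__ == "__main__":
    print(byte_trigrams("Haus"))
    # Non-ASCII characters span several bytes, so "Häuser" has 7 bytes.
    print(len(byte_trigrams("Häuser")))
```

Because the windows are taken over bytes rather than over a fixed vocabulary, any string in any language produces valid trigrams, with no out-of-vocabulary case.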

We have also published a state-of-the-art German dataset (data, arXiv), which can be used to enhance the German capabilities of LLMs.

Our future work is dedicated to advancing reasoning models, de-biasing frontier models, understanding the role of data in model training, comprehensive and realistic model evaluation, pushing the boundaries of small models, and advancing tokenizer-free architectures. We will continue to concentrate on creating transparent, trustworthy, and auditable systems that provide users with greater control and insight into the decision-making processes of AI models.

Want to shape the future of sovereign AI? Work with us.

datasets 3

Aleph-Alpha/Aleph-Alpha-GermanWeb

Viewer • Updated May 16 • 1.41B • 434 • 17

Aleph-Alpha/MTBench-German

Updated Apr 17 • 7

Aleph-Alpha/ticket-classification

Updated Mar 14, 2024 • 4

Collections 4

Aleph-Alpha/llama-3_1-70b-tfree-hat-sft

Aleph-Alpha/llama-tfree-hat-pretrained-7b-dpo

Aleph-Alpha/tfree-hat-pretrained-7b-base

models 29

Aleph-Alpha/tfree-hat-pretrained-7b-base

Aleph-Alpha/llama-tfree-hat-pretrained-7b-dpo

Aleph-Alpha/llama-3_1-8b-tfree-hat-sft

Aleph-Alpha/llama-3_1-8b-tfree-hat-dpo

Aleph-Alpha/llama-3_1-8b-tfree-hat-base

Aleph-Alpha/llama-3_1-70b-tfree-hat-sft

Aleph-Alpha/Aleph-Alpha-GermanWeb-Quality-Classifier-BERT

Aleph-Alpha/Aleph-Alpha-GermanWeb-Grammar-Classifier-BERT

Aleph-Alpha/Aleph-Alpha-GermanWeb-Quality-Classifier-fastText

Aleph-Alpha/Aleph-Alpha-GermanWeb-Grammar-Classifier-fastText

Team members 139
