Bram Vanroy
BramVanroy
AI & ML interests
Artificial intelligence, natural language processing, computational linguistics
🐐 GEITje 7B ultra 🤖
SFT and DPO models for GEITje 7B Ultra, including the datasets used to train them.
Dutch Simplification
Leesplank 2023-2024
BLEURT
A back-up on the hub of the BLEURT models (https://github.com/google-research/bleurt/blob/master/checkpoints.md)
Fietje 2
An open and efficient LLM for Dutch based on phi-2
SFT & RL datasets for Dutch
Multilingual text-to-AMR
Llama 2 & Falcon finetunes
Older finetunes - not recommended! Use GEITje 7B Ultra instead
- Language Resources for Dutch Large Language Modelling
  Paper • 2312.12852 • Published • 9
- BramVanroy/Llama-2-13b-chat-dutch
  Text Generation • 13B • Updated • 1.79k • 19
- BramVanroy/llama2-13b-ft-mc4_nl_cleaned_tiny
  Text Generation • 13B • Updated • 23 • 4
- BramVanroy/falcon-40b-ft-alpaca-dolly-dutch
  Text Generation • 41B • Updated • 26 • 4
CommonCrawl-Creative Commons (C5)
Raw CommonCrawl crawls, annotated with Creative Commons license information