Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit

Model Overview

Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit is a retrained variant of NVIDIA's Mistral-NeMo-Minitron-8B-Base, fine-tuned specifically to solve ARC-AGI tasks. To save GPU memory, the vocabulary, and with it the embedding matrix, has been reduced to only 77 tokens. The model achieved a score of 53.5 on the ARC-AGI private evaluation set in the Kaggle ARC Prize 2024 competition. Note that the ARC-AGI public evaluation set was used as training data for this model. Please refer to our paper for more details. For more models tuned for ARC-AGI, check out our model collection.
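
As a quick sanity check, the reduced vocabulary can be inspected directly through the tokenizer. This is a minimal sketch, assuming the tokenizer ships with this checkpoint and loads via transformers:

```python
from transformers import AutoTokenizer

# Load the tokenizer bundled with the checkpoint (assumption: it is
# included in this repository, as is usual for Hugging Face model repos).
tokenizer = AutoTokenizer.from_pretrained(
    "da-fr/Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit"
)
print(len(tokenizer))  # expected to print 77, the reduced vocabulary size
```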

Finetuning Datasets

This model was fine-tuned on the following datasets:

License

This model is released under the NVIDIA Open Model License Agreement.

Usage

This model can be used with the transformers or unsloth packages. For more information on preprocessing the ARC Prize tasks to generate prompts for the model, please refer to our paper and our GitHub repository.
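
Below is a minimal loading sketch using transformers. The exact input format is defined by the preprocessing described in the paper, so the prompt string here is only a placeholder; since the checkpoint is stored in bitsandbytes 4-bit format, the bitsandbytes package must be installed alongside transformers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "da-fr/Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The weights are pre-quantized with bitsandbytes (bnb-4bit), so no extra
# quantization config is needed; transformers reads it from the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # place the quantized weights on available GPUs
    torch_dtype=torch.bfloat16,  # compute dtype for the non-quantized parts
)

# Placeholder prompt: a real input must be an ARC task serialized with the
# preprocessing from the paper / GitHub repository.
prompt = "..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```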
