This model was fine-tuned in 4-bit precision with QLoRA on the timdettmers/openassistant-guanaco dataset. Its weights are sharded so that it can be loaded in 4-bit on a free Google Colab instance.
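
To give a rough sense of why 4-bit loading matters on a memory-constrained instance, the sketch below estimates the memory taken by the weights alone at a few precisions. It assumes the 6.74B parameter count reported for this model; the helper name `weight_memory_gb` is made up for illustration, and the estimate ignores quantization constants, activations, and the KV cache, so real usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory at different precisions.
# Assumption: 6.74B parameters, as reported for this model.
PARAM_COUNT = 6.74e9


def weight_memory_gb(param_count: float, bits_per_param: int) -> float:
    """Return the approximate size of the weights in decimal gigabytes."""
    bytes_total = param_count * bits_per_param / 8
    return bytes_total / 1e9


for label, bits in [("F32", 32), ("F16", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(PARAM_COUNT, bits):.2f} GB")
```

Under these assumptions the 4-bit weights come to roughly one eighth of the F32 footprint.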

It can be loaded with the AutoModelForCausalLM class from transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "guardrail/llama-2-7b-guanaco-instruct-sharded"

# Load the sharded checkpoint with 4-bit weights (requires the bitsandbytes package)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
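
Once the model and tokenizer are loaded, generating a reply might look like the sketch below. The `### Human:` / `### Assistant:` turn markers are an assumption based on the format of the openassistant-guanaco training data, not something this card states, and both helper names (`build_guanaco_prompt`, `generate_reply`) are hypothetical.

```python
def build_guanaco_prompt(user_message: str) -> str:
    """Wrap a user message in Guanaco-style turn markers (assumed format)."""
    return f"### Human: {user_message.strip()}\n### Assistant:"


def generate_reply(model, tokenizer, user_message: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for one user message.

    `model` and `tokenizer` are the objects returned by the
    from_pretrained calls in the loading snippet.
    """
    prompt = build_guanaco_prompt(user_message)
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The decoded text contains the prompt followed by the model's reply.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_reply(model, tokenizer, "What is QLoRA?")` returns the prompt together with the generated continuation; strip the prompt prefix if only the reply is needed.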

Model size (Safetensors): 6.74B params
Tensor type: F32