About the uploaded model

Setup

You can run the quantized model with these steps:

  • Check the requirements of the original model; in particular, verify the Python, CUDA, and transformers versions (a quick version-check sketch follows the install commands below).

  • Make sure that you have installed the quantization-related packages:

pip install "bitsandbytes>=0.39.0"
pip install --upgrade accelerate transformers
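
As a quick sanity check for the requirements above, you can print the versions that matter. This is a minimal sketch; the exact minimum versions depend on the original model card.

import sys
import torch
import transformers

# confirm the environment matches the original model's requirements
print("python:", sys.version.split()[0])
print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
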
  • Load and run the model:
from transformers import AutoProcessor, AutoModelForVision2Seq
from huggingface_hub import hf_hub_download
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

quantized_model_path = 'hassenhamdi/granite-vision-3.1-2b-preview-4bit'
base_model_path = 'ibm-granite/granite-vision-3.1-2b-preview'

# load the pre-quantized 4-bit weights; quantized models should not be moved
# with .to(), so place them via device_map instead (they generally need a CUDA device)
model = AutoModelForVision2Seq.from_pretrained(
    quantized_model_path,
    device_map=device,
    trust_remote_code=True,
)
# the processor (tokenizer + image processor) comes from the original repo
processor = AutoProcessor.from_pretrained(base_model_path)

# prepare image and text prompt, using the appropriate prompt template
img_path = hf_hub_download(repo_id=base_model_path, filename='example.png')

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": img_path},
            {"type": "text", "text": "What is the highest scoring model on ChartQA and what is its score?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(device)


# autoregressively complete prompt
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
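
If you prefer to quantize the original checkpoint yourself instead of loading the pre-quantized weights, transformers supports on-the-fly 4-bit loading via BitsAndBytesConfig. The sketch below is a minimal example assuming the base repo id used above; it is not necessarily the exact recipe used to produce this checkpoint.

from transformers import AutoModelForVision2Seq, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization applied at load time; requires bitsandbytes and a CUDA GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForVision2Seq.from_pretrained(
    'ibm-granite/granite-vision-3.1-2b-preview',
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)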

Configurations

  • The configuration info is in config.json (see the inspection sketch after this list).
  • Safetensors model size: 1.59B params

  • Tensor types: F32, FP16, U8
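
To inspect those settings programmatically, you can pull config.json from the hub. This is a minimal sketch; the quantization_config key is an assumption based on how bitsandbytes checkpoints are commonly saved, so the code falls back to printing the full config.

import json
from huggingface_hub import hf_hub_download

# download the quantized model's config.json from the hub
config_path = hf_hub_download(
    repo_id='hassenhamdi/granite-vision-3.1-2b-preview-4bit',
    filename='config.json',
)
with open(config_path) as f:
    config = json.load(f)

# print the quantization section if present, otherwise the whole config
print(json.dumps(config.get("quantization_config", config), indent=2))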