Update preprocessor_config.json
The same change was made in the Qwen/Qwen2.5-VL-72B-Instruct repo.
`Qwen2_5_VLImageProcessor` is what causes the "Unrecognized image processor" error.
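For context, the repo-side fix is a one-line edit to `preprocessor_config.json`: `transformers` releases that predate the `Qwen2_5_VLImageProcessor` class fail with "Unrecognized image processor", so `image_processor_type` is switched to the compatible `Qwen2VLImageProcessor`. A minimal sketch of applying the same patch to a local copy of the file (the function name and path handling are illustrative, not from this thread):

```python
import json
from pathlib import Path

def patch_image_processor(config_path):
    """Rewrite image_processor_type so older transformers releases can load it."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("image_processor_type") == "Qwen2_5_VLImageProcessor":
        # The class older transformers versions do recognize.
        config["image_processor_type"] = "Qwen2VLImageProcessor"
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config["image_processor_type"]
```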
I'm on a Linux server and can't find the `preprocessor_config.json` path, so what I did was create a local folder for `processor = AutoProcessor.from_pretrained("q_processor", trust_remote_code=True)` to look at, and put `preprocessor_config.json` and `tokenizer_config.json` in it. But now I get the error below:
Traceback (most recent call last):
File "/home//OCR_Folder/qwen2.3.py", line 13, in <module>
processor = AutoProcessor.from_pretrained("q_processor", trust_remote_code=True)
File "/home//.local/lib/python3.10/site-packages/transformers/models/auto/processing_auto.py", line 334, in from_pretrained
return processor_class.from_pretrained(
File "/home//.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1070, in from_pretrained
args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
File "/home//.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1116, in _get_arguments_from_pretrained
args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
File "/home//.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2052, in from_pretrained
return cls._from_pretrained(
File "/home//.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2090, in _from_pretrained
slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(
File "/home//.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2292, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home//.local/lib/python3.10/site-packages/transformers/models/qwen2/tokenization_qwen2.py", line 172, in __init__
with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
What should I do now?
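The `TypeError: expected str, bytes or os.PathLike object, not NoneType` means the slow Qwen2 tokenizer was handed `vocab_file=None`: the hand-made `q_processor` folder contains only the two config files, while the BPE tokenizer also needs its data files (`vocab.json` and `merges.txt`, or the fast `tokenizer.json`). A small helper to check a local folder before calling `AutoProcessor.from_pretrained` (the file list is my assumption of a complete processor directory, not something from this thread):

```python
from pathlib import Path

# Files AutoProcessor typically resolves for a Qwen2-family checkpoint.
# vocab.json/merges.txt feed the slow tokenizer; tokenizer.json the fast one.
EXPECTED = [
    "preprocessor_config.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
    "tokenizer.json",
]

def missing_processor_files(folder):
    """Return the expected processor files that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in EXPECTED if not (folder / name).is_file()]
```

If `vocab.json`/`merges.txt` show up as missing, copying them from the model snapshot (or pointing `from_pretrained` at the full snapshot directory instead of a hand-built folder) avoids this `NoneType` failure.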
I was able to run unsloth/Qwen2.5-VL-3B-Instruct-unsloth-bnb-4bit using the code below:
################ IMPORTS ###################
!pip install --no-deps bitsandbytes accelerate xformers==0.0.29 peft trl triton
!pip install --no-deps cut_cross_entropy unsloth_zoo
!pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
!pip install --no-deps unsloth
!pip install -U transformers
#######################
from huggingface_hub import snapshot_download
# Define repo and PR reference
repo_id = "unsloth/Qwen2.5-VL-3B-Instruct-unsloth-bnb-4bit"
revision = "refs/pr/2" # Use PR #2
# Download model snapshot
local_dir = snapshot_download(repo_id=repo_id, revision=revision)
# Load model from local directory
from unsloth import FastVisionModel
model, tokenizer = FastVisionModel.from_pretrained(
    local_dir,  # Use the downloaded directory
    load_in_4bit=True,
    use_gradient_checkpointing="unsloth",
)
FastVisionModel.for_inference(model) # Enable for inference!
image = dataset[2]["image"]  # assumes a `dataset` loaded earlier
instruction = "Write the LaTeX representation for this image."
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction},
    ]}
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens = False,
    return_tensors = "pt",
).to("cuda")
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,
                   use_cache = True, temperature = 1.5, min_p = 0.1)
Change the repo id and revision, and try:
repo_id = "unsloth/Qwen2.5-VL-72B-Instruct-bnb-4bit"
revision = "refs/pr/3" # PR reference
Thanks so much @TahirC ! :)
Do we have to do this for every other Qwen2.5-VL model? @TahirC
@shimmyshimmer Yes, all of the Qwen2.5-VL models have this change.
You can look into this PR for more info: https://github.com/huggingface/transformers/pull/36164