Error when loading model

#1
by lczazu - opened

Hi, when I load this model with

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = config.checkpoint,
    load_in_4bit = load_in_4bit,
    max_seq_length = config.max_length,
    dtype = dtype,
)

an error happens:

(screenshot of the error)

Can you try this out?

model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit"
max_seq_length = 2048
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    load_in_4bit = True,
)

(screenshot of the successful load)

For me, it works fine.

Wow, you're on A800 80GB? So envy you! :) Hope it helps.


Have you tried to load it with vLLM?
I've encountered this error:

assert param_data.shape == loaded_weight.shape
AssertionError

I don't know why this happened.
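For what it's worth, that assertion usually means a weight tensor read from the checkpoint doesn't have the shape the model allocated for it. Pre-quantized bnb-4bit checkpoints store weights packed two 4-bit values per byte, so the stored shape differs from the full-precision one. A minimal sketch of the mismatch (plain tuples standing in for tensor `.shape`s; the exact dimensions are illustrative):

```python
# Shape the model allocates for a full-precision linear layer.
param_shape = (4096, 4096)

# bitsandbytes packs two 4-bit values per uint8 and flattens the result,
# so the tensor stored in a pre-quantized checkpoint has a different shape.
loaded_shape = (4096 * 4096 // 2, 1)

# This is the comparison that fires the loader's AssertionError.
print(param_shape == loaded_shape)  # False
```

So a loader that expects full-precision weights will trip this assert as soon as it meets the packed tensors.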


Now, I don’t have the A800 anymore. 😅

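If you want to load the pre-quantized checkpoint in vLLM directly, one thing to try is telling vLLM the weights are bitsandbytes-quantized, so it expects the packed shapes instead of full-precision ones. A sketch, not verified here; the `quantization`/`load_format` values and whether your vLLM build supports pre-quantized bitsandbytes checkpoints are assumptions:

```python
def vllm_load_kwargs(model_name: str, max_len: int = 2048) -> dict:
    # Assumed flags: ask vLLM to use its bitsandbytes weight loader
    # so it accepts the packed 4-bit tensor shapes.
    return {
        "model": model_name,
        "quantization": "bitsandbytes",
        "load_format": "bitsandbytes",
        "max_model_len": max_len,
    }

if __name__ == "__main__":
    from vllm import LLM  # requires a GPU build of vLLM
    llm = LLM(**vllm_load_kwargs(
        "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit"))
```

If your vLLM version predates bitsandbytes support, upgrading first may be necessary.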
I believe I used vLLM, per the following step:

(screenshot of the training configuration)

As you can see, "use_vllm = True, # use vLLM for fast inference!" :)

Also, the training ran with no problem.
Two questions:

  1. Can I see the full error trace that contains "assert param_data.shape == loaded_weight.shape
    AssertionError"?
  2. Can you share the version of vLLM you're using?
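On question 2, one way to print the installed vLLM version without importing the whole package is a plain `importlib.metadata` lookup (standard library, nothing vLLM-specific):

```python
from importlib.metadata import version, PackageNotFoundError

def pkg_version(name: str) -> str:
    # Read the version from installed package metadata; no heavy import.
    try:
        return version(name)
    except PackageNotFoundError:
        return "not installed"

print(pkg_version("vllm"))
```

`pip show vllm` from a shell gives the same information.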

FYI
This is the summary of my training:

TrainOutput(global_step=250, training_loss=7.667776655330272e-05, metrics={'train_runtime': 3990.5973, 'train_samples_per_second': 0.063, 'train_steps_per_second': 0.063, 'total_flos': 0.0, 'train_loss': 7.667776655330272e-05})
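As an aside, the numbers in that TrainOutput are internally consistent; dividing steps by runtime reproduces the reported steps-per-second:

```python
# Values taken from the TrainOutput above.
global_step = 250
train_runtime = 3990.5973  # seconds

print(round(global_step / train_runtime, 3))  # 0.063, matching train_steps_per_second
```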
