Model Request

#1320
by mshojaei77 - opened

Could you please quant this model:
mshojaei77/gemma-3n-E4B-persian

It's queued! :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#gemma-3n-E4B-persian-GGUF for quants to appear.

This model is already quantized (its weights are stored as uint8), so it can't be converted into a GGUF:

WARNING:hf-to-gguf:ignore token 262399: id is out of range, max=262143
INFO:hf-to-gguf:token_embd.weight,                 torch.float16 --> F16, shape = {2048, 262144}
INFO:hf-to-gguf:blk.0.altup_correct_scale.weight,  torch.float16 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.0.altup_correct_coef.weight,   torch.float16 --> F32, shape = {4, 4}
INFO:hf-to-gguf:blk.0.altup_router.weight,         torch.float16 --> F16, shape = {2048, 4}
INFO:hf-to-gguf:blk.0.altup_predict_coef.weight,   torch.float16 --> F32, shape = {4, 16}
INFO:hf-to-gguf:blk.0.altup_router_norm.weight,    torch.float32 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.0.attn_norm.weight,            torch.float32 --> F32, shape = {2048}
WARNING:hf-to-gguf:Cannot find destination type matching torch.uint8: Using F16
INFO:hf-to-gguf:blk.0.laurel_l.weight,             torch.uint8 --> F16, shape = {1, 65536}
Traceback (most recent call last):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 8833, in <module>
    main()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 8827, in main
    model_instance.write()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 441, in write
    self.prepare_tensors()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 298, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 5210, in modify_tensors
    return super().modify_tensors(data_torch, name, bid)
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 5064, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 257, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.layers.0.laurel.linear_left.weight.absmax'
job finished, status 1
job-done<0 gemma-3n-E4B-persian noquant 1>

error/1 ValueError Can not map tensor '
https://huggingface.co/mshojaei77/gemma-3n-E4B-persia
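
For context, the '.absmax' suffix in the failing tensor name comes from bitsandbytes 4-bit serialization, which stores per-block quant-state tensors (absmax, quant_map, ...) alongside the packed uint8 weights; convert_hf_to_gguf.py has no mapping for those. A minimal, untested sketch for spotting them in a repo (note it downloads a full safetensors shard):

from huggingface_hub import hf_hub_download, list_repo_files
from safetensors import safe_open

repo = "mshojaei77/gemma-3n-E4B-persian"

# Grab the first safetensors shard and list bitsandbytes quant-state tensors.
shard = next(f for f in list_repo_files(repo) if f.endswith(".safetensors"))
path = hf_hub_download(repo, shard)
with safe_open(path, framework="pt") as f:
    bnb_keys = [k for k in f.keys()
                if k.endswith((".absmax", ".quant_map")) or ".quant_state" in k]
print(len(bnb_keys), "bitsandbytes quant-state tensors, e.g.", bnb_keys[:3])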

I just checked the config.json of the model and it is indeed already bitsandbytes quantized:

"quantization_config": {
  "bnb_4bit_compute_dtype": "float16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": true,
  "llm_int8_enable_fp32_cpu_offload": false,
  "llm_int8_has_fp16_weight": false,
  "llm_int8_skip_modules": null,
  "llm_int8_threshold": 6.0,
  "load_in_4bit": true,
  "load_in_8bit": false,
  "quant_method": "bitsandbytes"
},
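
For completeness, the same check can be done programmatically by looking for a quantization_config entry in config.json; a small sketch (assuming only the huggingface_hub package is installed):

import json
from huggingface_hub import hf_hub_download

# Download config.json and report whether the checkpoint declares a quantization method.
cfg_path = hf_hub_download("mshojaei77/gemma-3n-E4B-persian", "config.json")
with open(cfg_path) as fh:
    cfg = json.load(fh)

qc = cfg.get("quantization_config")
print("already quantized:", qc is not None, "| method:", (qc or {}).get("quant_method"))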

I noticed that you are the author of this model, so maybe it would be possible to upload an unquantized version so it can be converted into GGUF?
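
If it helps, the cleanest fix is to re-save and upload the original full-precision weights. Failing that, a bitsandbytes checkpoint can sometimes be dequantized back to fp16 in recent transformers. A rough, untested sketch, assuming AutoModelForCausalLM loads this checkpoint (a multimodal Gemma 3n repo may need a different Auto class) and that there is enough memory; note that dequantizing nf4 does not recover the original precision:

import torch
from transformers import AutoModelForCausalLM

# Load the bnb-quantized checkpoint, dequantize it, and save a plain fp16 copy
# that convert_hf_to_gguf.py can map. Repo/class names here are assumptions.
model = AutoModelForCausalLM.from_pretrained(
    "mshojaei77/gemma-3n-E4B-persian",
    device_map="auto",
    torch_dtype=torch.float16,
)
model = model.dequantize()  # supported for bitsandbytes models in recent transformers
model.save_pretrained("gemma-3n-E4B-persian-fp16")
# model.push_to_hub("mshojaei77/gemma-3n-E4B-persian-fp16")  # hypothetical repo name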
