Model Request

#1320
by mshojaei77 - opened

Could you please quant this model:
mshojaei77/gemma-3n-E4B-persian

It's queued! :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#gemma-3n-E4B-persian-GGUF for quants to appear.

This model is already quantized (its weights are stored as uint8), so it can't be converted into a GGUF:

WARNING:hf-to-gguf:ignore token 262399: id is out of range, max=262143
INFO:hf-to-gguf:token_embd.weight,                 torch.float16 --> F16, shape = {2048, 262144}
INFO:hf-to-gguf:blk.0.altup_correct_scale.weight,  torch.float16 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.0.altup_correct_coef.weight,   torch.float16 --> F32, shape = {4, 4}
INFO:hf-to-gguf:blk.0.altup_router.weight,         torch.float16 --> F16, shape = {2048, 4}
INFO:hf-to-gguf:blk.0.altup_predict_coef.weight,   torch.float16 --> F32, shape = {4, 16}
INFO:hf-to-gguf:blk.0.altup_router_norm.weight,    torch.float32 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.0.attn_norm.weight,            torch.float32 --> F32, shape = {2048}
WARNING:hf-to-gguf:Cannot find destination type matching torch.uint8: Using F16
INFO:hf-to-gguf:blk.0.laurel_l.weight,             torch.uint8 --> F16, shape = {1, 65536}
Traceback (most recent call last):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 8833, in <module>
    main()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 8827, in main
    model_instance.write()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 441, in write
    self.prepare_tensors()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 298, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 5210, in modify_tensors
    return super().modify_tensors(data_torch, name, bid)
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 5064, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 257, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.layers.0.laurel.linear_left.weight.absmax'
job finished, status 1
job-done<0 gemma-3n-E4B-persian noquant 1>

error/1 ValueError Can not map tensor '
https://huggingface.co/mshojaei77/gemma-3n-E4B-persia
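
For context, the '.absmax' suffix in the failing tensor name comes from bitsandbytes 4-bit serialization, which stores per-block quant-state tensors (absmax, quant_map, ...) alongside the packed uint8 weights; convert_hf_to_gguf.py has no mapping for those. A minimal, untested sketch for spotting them in a repo (note it downloads a full safetensors shard):

from huggingface_hub import hf_hub_download, list_repo_files
from safetensors import safe_open

repo = "mshojaei77/gemma-3n-E4B-persian"

# Grab the first safetensors shard and list bitsandbytes quant-state tensors.
shard = next(f for f in list_repo_files(repo) if f.endswith(".safetensors"))
path = hf_hub_download(repo, shard)
with safe_open(path, framework="pt") as f:
    bnb_keys = [k for k in f.keys()
                if k.endswith((".absmax", ".quant_map")) or ".quant_state" in k]
print(len(bnb_keys), "bitsandbytes quant-state tensors, e.g.", bnb_keys[:3])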

I just checked the config.json of the model and it is indeed already bitsandbytes quantized:

"quantization_config": {
  "bnb_4bit_compute_dtype": "float16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": true,
  "llm_int8_enable_fp32_cpu_offload": false,
  "llm_int8_has_fp16_weight": false,
  "llm_int8_skip_modules": null,
  "llm_int8_threshold": 6.0,
  "load_in_4bit": true,
  "load_in_8bit": false,
  "quant_method": "bitsandbytes"
},
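
For completeness, the same check can be done programmatically by looking for a quantization_config entry in config.json; a small sketch (assuming only the huggingface_hub package is installed):

import json
from huggingface_hub import hf_hub_download

# Download config.json and report whether the checkpoint declares a quantization method.
cfg_path = hf_hub_download("mshojaei77/gemma-3n-E4B-persian", "config.json")
with open(cfg_path) as fh:
    cfg = json.load(fh)

qc = cfg.get("quantization_config")
print("already quantized:", qc is not None, "| method:", (qc or {}).get("quant_method"))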

I noticed that you are the author of this model, so maybe it would be possible to upload an unquantized version so it can be converted into GGUF?
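
If it helps, the cleanest fix is to re-save and upload the original full-precision weights. Failing that, a bitsandbytes checkpoint can sometimes be dequantized back to fp16 in recent transformers. A rough, untested sketch, assuming AutoModelForCausalLM loads this checkpoint (a multimodal Gemma 3n repo may need a different Auto class) and that there is enough memory; note that dequantizing nf4 does not recover the original precision:

import torch
from transformers import AutoModelForCausalLM

# Load the bnb-quantized checkpoint, dequantize it, and save a plain fp16 copy
# that convert_hf_to_gguf.py can map. Repo/class names here are assumptions.
model = AutoModelForCausalLM.from_pretrained(
    "mshojaei77/gemma-3n-E4B-persian",
    device_map="auto",
    torch_dtype=torch.float16,
)
model = model.dequantize()  # supported for bitsandbytes models in recent transformers
model.save_pretrained("gemma-3n-E4B-persian-fp16")
# model.push_to_hub("mshojaei77/gemma-3n-E4B-persian-fp16")  # hypothetical repo name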
