IMatrix GGUF quants of THUDM GLM-Z1-9B-0414.

FIXED by pull request https://github.com/ggml-org/llama.cpp/pull/12957; tested and working.

EDIT 22-04-2025: I deleted the old GGUFs and uploaded a new one built with the PR from https://github.com/ggml-org/llama.cpp/pull/13021. This lets you test with the mainline (non-patched) version of llama.cpp, and with LM Studio as well. This upload is NOT an imatrix quant, since I wanted it up fast; I might re-upload imatrix quants later.

If you are testing with LM Studio, replace the Jinja template with this one:

[gMASK]<sop>
{%- if tools -%}
<|system|>
你是一个名为 ChatGLM 的人工智能助手。你是基于智谱 AI 公司训练的语言模型 GLM-4 模型开发的,你的任务是针对用户的问题和要求提供适当的答复和支持。
# 可用工具
{%- for tool in tools %}
    {%- set function = tool.function if tool.get("function") else tool %}
## {{ function.name }}
{{ function | tojson(indent=4, ensure_ascii=False) }}
在调用上述函数时,请使用 Json 格式表示调用的参数。
{%- endfor %}
{%- endif -%}
{%- for msg in messages %}
    {%- if msg.role == 'system' %}
<|system|>
{{ msg.content }}
    {%- endif %}
{%- endfor %}
{%- for message in messages if message.role != 'system' %}
    {%- set role = message['role'] %}
    {%- set content = message['content'] %}
    {%- set thinkcontent = content.split('</think>') %}
    {%- set visible = thinkcontent[-1].strip() %}
    {%- set meta = message.get("metadata", "") %}
    {%- if role == 'user' %}
<|user|>
{{ visible }}
    {%- elif role == 'assistant' and not meta %}
<|assistant|>
{{ visible }}
    {%- elif role == 'assistant' and meta %}
<|assistant|>{{ meta }}
{{ visible }}
    {%- elif role == 'observation' %}
<|observation|>
{{ visible }}
    {%- endif %}
{%- endfor %}
{% if add_generation_prompt %}<|assistant|>\n<think>{% endif %}
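
The template above only feeds the part of each earlier message that comes after the last `</think>` back into the prompt, so old reasoning is dropped from the history. Here is a minimal Python sketch of just that step, in case you want to check what the model will see. The helper name `visible_text` and the sample messages are made up for illustration; this is not code from llama.cpp or LM Studio, and it does not reproduce the rest of the template.

```python
def visible_text(content: str) -> str:
    """Mirror the template's `content.split('</think>')[-1].strip()` step.

    Hypothetical helper for illustration only.
    """
    # Keep only the text after the last closing think tag, then trim whitespace.
    return content.split("</think>")[-1].strip()


# Made-up conversation history to show the effect.
history = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "<think>Simple arithmetic.</think>\n4"},
]

for msg in history:
    print(msg["role"], "->", repr(visible_text(msg["content"])))
```

Note that a message with no `</think>` in it passes through unchanged (apart from the whitespace trim), so user messages are not affected.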