unsloth/DeepSeek-R1-GGUF · Poor generation quality from the merged IQ1_S model deployed on Ollama

gaozj

13 days ago

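I deployed the IQ1_S model in Ollama and ran inference through the OpenAI API endpoint. The output contains many answers that are unrelated to the question (poor generation quality). Is this a common problem with quantized models? I have also seen other quantized models that I deployed generate too many irrelevant answers.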

shimmyshimmer

Unsloth AI org 10 days ago

Are you sure you're using the correct chat template? Did you set the temperature to 0.6?
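For reference, here is a minimal Python sketch of setting the temperature explicitly on each request instead of relying on the server default. It assumes Ollama's OpenAI-compatible endpoint on the default local port (`http://localhost:11434/v1/chat/completions`); the helper names `build_payload` and `ask` and the model name in the comment are hypothetical, so substitute whatever name was given to `ollama create`.

```python
import json
import urllib.request

# Assumption: Ollama's OpenAI-compatible endpoint on the default local port.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"


def build_payload(model: str, question: str, temperature: float = 0.6) -> dict:
    """Build a chat-completions request body with an explicit temperature."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": temperature,
    }


def ask(model: str, question: str) -> str:
    """POST the payload to the server and return the assistant's reply text."""
    body = json.dumps(build_payload(model, question)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


# Example call (needs a running Ollama server; the model name is hypothetical):
# print(ask("deepseek-r1-iq1s", "What is 1 + 1?"))
```

This only controls sampling. If the chat template is wrong, the model will probably still ramble regardless of the temperature.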

IanAndHis314Pies

about 14 hours ago

LLM newbie here.
I ran into a similar issue and solved it by setting TEMPLATE in the Ollama Modelfile, which is the file passed when running ollama create <custom_model_name> -f <ModelFile name>.

Here's the ModelFile I used (the template and parameters are copied from https://ollama.com/library/deepseek-r1):

FROM .\llama-b4754-bin-win-cuda-cu12.4-x64\DeepSeek-R1-GGUF\DeepSeek-R1-Merged\DeepSeek-R1-UD-IQ1_S-merged.gguf

TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<｜User｜>{{ .Content }}
{{- else if eq .Role "assistant" }}<｜Assistant｜>{{ .Content }}{{- if not $last }}<｜end▁of▁sentence｜>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<｜Assistant｜>{{- end }}
{{- end }}
"""

PARAMETER stop "<｜begin▁of▁sentence｜>"
PARAMETER stop "<｜end▁of▁sentence｜>"
PARAMETER stop "<｜User｜>"
PARAMETER stop "<｜Assistant｜>"
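
To check what prompt the template above actually produces, here is a rough Python re-implementation of its logic. It is a sketch of the template as posted, not Ollama's own renderer: the function name `render_prompt` is made up for illustration, and any whitespace handling beyond the template's `{{-` trimming is ignored.

```python
# Special tokens exactly as they appear in the TEMPLATE (full-width bars).
USER = "<｜User｜>"
ASSISTANT = "<｜Assistant｜>"
EOS = "<｜end▁of▁sentence｜>"


def render_prompt(messages, system=""):
    """Mirror the TEMPLATE: system text first, then each turn in order."""
    parts = [system] if system else []
    for i, msg in enumerate(messages):
        last = i == len(messages) - 1
        if msg["role"] == "user":
            parts.append(USER + msg["content"])
        elif msg["role"] == "assistant":
            parts.append(ASSISTANT + msg["content"])
            # Finished assistant turns are closed with the end-of-sentence token.
            if not last:
                parts.append(EOS)
        # If the conversation does not end on an assistant turn,
        # open one so the model starts answering.
        if last and msg["role"] != "assistant":
            parts.append(ASSISTANT)
    return "".join(parts)


print(render_prompt([{"role": "user", "content": "Hi"}]))
# prints: <｜User｜>Hi<｜Assistant｜>
```

Without a template like this, the model receives the bare question with no role tokens around it, which would explain the unrelated output described in the first post.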