update vllm usage
README.md
CHANGED
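For context on the change in the diff below: `vllm serve` exposes an OpenAI-compatible HTTP API (default `http://localhost:8000`), so once the server is up a client can POST a chat-completions request to it. A minimal sketch of the request body a client would send to `/v1/chat/completions` — the endpoint path, port, and prompt here are illustrative assumptions, not part of this commit:

```python
import json

# Request body for vLLM's OpenAI-compatible chat endpoint
# (POST http://localhost:8000/v1/chat/completions, default port assumed).
payload = {
    # Must match the model name passed to `vllm serve`
    "model": "Intel/Qwen3-Coder-480B-A35B-Instruct-int4-AutoRound",
    "messages": [
        {"role": "user", "content": "Write a quicksort function in Python."}
    ],
    "max_tokens": 512,
}

# Serialized body, ready to send with any HTTP client
body = json.dumps(payload)
print(body)
```

The same request can be issued with `curl -H "Content-Type: application/json" -d @body.json` or via the official `openai` Python client pointed at the local base URL.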
@@ -15,6 +15,11 @@ Please follow the license of the original model.
 
 ## How To Use
 
+**vLLM usage**
+~~~bash
+vllm serve Intel/Qwen3-Coder-480B-A35B-Instruct-int4-AutoRound --tensor-parallel-size 4 --max-model-len 65536
+~~~
+
 **INT4 Inference on CPU/Intel GPU/CUDA**
 
 ~~~python