Intel/Qwen3-235B-A22B-Instruct-2507-int4-AutoRound

4-bit precision

weiweiz1 committed 24 days ago

Commit 71abea5 (verified) · 1 parent: 3c3f869

Update README.md

Files changed (1)

README.md +8 -0

@@ -15,6 +15,14 @@ Please follow the license of the original model.
 ## How To Use
+**vLLM usage**
+~~~bash
+vllm serve Intel/Qwen3-235B-A22B-Instruct-2507-int4-AutoRound --tensor-parallel-size 4 --max-model-len 32768
+~~~
 **INT4 Inference on CPU/Intel GPU/CUDA**
 ~~~python