wenhuach committed on
Commit d97e046 · verified · 1 Parent(s): 0e8eae2

update vllm usage

Files changed (1): README.md +5 -0
README.md CHANGED
@@ -15,6 +15,11 @@ Please follow the license of the original model.
 
 ## How To Use
 
+**vLLM usage**
+~~~bash
+vllm serve Intel/Qwen3-Coder-480B-A35B-Instruct-int4-AutoRound --tensor-parallel-size 4 --max-model-len 65536
+~~~
+
 **INT4 Inference on CPU/Intel GPU/CUDA**
 
 ~~~python
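
Once the server from the added `vllm serve` line is running, it exposes an OpenAI-compatible HTTP API. A minimal sketch of querying it, assuming the default port 8000 and the `/v1/chat/completions` route (the helper names here are illustrative, not part of the model card):

~~~python
# Hypothetical client sketch for the vLLM server started above.
# Assumes the OpenAI-compatible endpoint at http://localhost:8000.
import json
import urllib.request


def build_chat_request(prompt: str) -> dict:
    # Payload shape for an OpenAI-style /v1/chat/completions request.
    return {
        "model": "Intel/Qwen3-Coder-480B-A35B-Instruct-int4-AutoRound",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }


def query(prompt: str, url: str = "http://localhost:8000/v1/chat/completions") -> str:
    # POST the JSON payload and return the first completion's text.
    req = urllib.request.Request(
        url,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(query("Write a Python function that reverses a string."))
~~~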