Usage: Serving with SGLang
Download the weights. You can replace /local/grok-2 with any other folder name you prefer.

hf download xai-org/grok-2 --local-dir /local/grok-2
You might encounter some errors during the download. Please retry until the download is successful.
If the download succeeds, the folder should contain 42 files and be approximately 500 GB.
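As a quick check (a minimal sketch, assuming you kept the /local/grok-2 path from above), you can verify the file count and total size:

ls /local/grok-2 | wc -l    # should print 42
du -sh /local/grok-2        # should report roughly 500 GB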
Launch a server.
Install the latest SGLang inference engine by following the instructions at https://docs.sglang.ai/get_started/install.html.
Then, you can launch an inference server. This checkpoint is TP=8, so you will need 8 GPUs (each with more than 40 GB of memory).
python3 -m sglang.launch_server --model /local/grok-2 --tokenizer-path /local/grok-2/tokenizer.tok.json --tp 8 --quantization fp8 --attention-backend triton
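Once the server finishes loading the weights, you can check that it is reachable. This is a minimal sketch that assumes the server is running locally on SGLang's default port 30000 and that your SGLang version exposes the /health and /get_model_info endpoints; adjust the host and port if you launched it differently.

# basic liveness check of the SGLang server
curl http://localhost:30000/health
# should report the /local/grok-2 model path
curl http://localhost:30000/get_model_info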
Send a request.
python3 -m sglang.test.send_one --prompt "Human: What is your name? <|separator|>\n\nAssistant:"
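Alternatively, you can send a raw HTTP request to the server yourself. The sketch below is one possible way to do this via SGLang's native /generate endpoint, assuming the default local port 30000; the sampling parameters are illustrative values, not a recommendation.

# send the same prompt through the native /generate HTTP endpoint
curl -s http://localhost:30000/generate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Human: What is your name? <|separator|>\n\nAssistant:",
    "sampling_params": {"temperature": 0.7, "max_new_tokens": 128}
  }'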
Learn more about other ways of sending requests in the SGLang documentation at https://docs.sglang.ai/.