---
language:
- en
license: other
library_name: sglang
pipeline_tag: text-generation
tags:
- grok-2
- xai
- sglang
- inference
- triton
base_model: xai-org/grok-2
model-index:
- name: grok-2
  results: []
---
# Grok 2
This repository contains the weights of Grok 2, a model trained and used at xAI in 2024.
- License: Grok 2 Community License Agreement (./LICENSE)
- Ownership: xAI (no changes to license or weights in this PR)
## Weights
- Download from the Hub (≈500 GB total; 42 files):

  ```bash
  hf download xai-org/grok-2 --local-dir /local/grok-2
  ```

  If you see transient errors, retry until it completes (a retry sketch follows below).
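If the CLI keeps hitting transient errors, a small retry loop around `huggingface_hub.snapshot_download` can be used instead; downloads are resumable, so already-fetched files are skipped. This is a minimal sketch, assuming `huggingface_hub` is installed and reusing the same repo id and target directory as the command above.

```python
# Minimal sketch: resumable download with basic retries (not an official procedure).
import time
from huggingface_hub import snapshot_download

for attempt in range(1, 11):
    try:
        snapshot_download(repo_id="xai-org/grok-2", local_dir="/local/grok-2")
        break  # all 42 files present
    except Exception as err:  # transient network/HTTP errors
        print(f"attempt {attempt} failed: {err}; retrying in 30 s")
        time.sleep(30)
```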
## Hardware and Parallelism
- This checkpoint is configured for TP=8.
- Recommended: 8× GPUs (each > 40 GB memory).
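As an optional pre-flight check, the sketch below uses PyTorch (an assumption; any CUDA-aware tool works) to confirm that at least 8 GPUs are visible and that each exceeds the recommended 40 GB before you launch the server.

```python
# Sketch: sanity-check the TP=8 hardware recommendation.
import torch

count = torch.cuda.device_count()
print(f"visible GPUs: {count}")
assert count >= 8, "TP=8 needs at least 8 GPUs"

for i in range(count):
    props = torch.cuda.get_device_properties(i)
    mem_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {mem_gb:.0f} GB")
    assert mem_gb > 40, f"GPU {i} has only {mem_gb:.0f} GB; > 40 GB is recommended"
```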
## Serving with SGLang (>= v0.5.1)
Install SGLang from https://github.com/sgl-project/sglang/
Launch an inference server:
```bash
python3 -m sglang.launch_server \
  --model /local/grok-2 \
  --tokenizer-path /local/grok-2/tokenizer.tok.json \
  --tp 8 \
  --quantization fp8 \
  --attention-backend triton
```
Send a test request (chat template aware):
```bash
python3 -m sglang.test.send_one --prompt \
  "Human: What is your name?<|separator|>\n\nAssistant:"
```
You should see the model respond with its name: “Grok”.
More ways to send requests:
- https://docs.sglang.ai/basic_usage/send_request.html
- Note: this is a post-trained model, so use the correct chat template (a plain-HTTP request sketch follows below):
  https://github.com/sgl-project/sglang/blob/97a38.../tiktoken_tokenizer.py#L106
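Beyond `send_one`, you can also post directly to the server's native `/generate` endpoint, as covered in the "send request" docs linked above. The sketch below assumes the launch command above is running on SGLang's default port 30000 and hand-writes the post-training chat format (the `Human:`/`Assistant:` prefixes and `<|separator|>` token from the test prompt); the sampling parameters are illustrative.

```python
# Sketch: query the running SGLang server over HTTP with the Grok 2 chat format.
import requests

prompt = "Human: What is your name?<|separator|>\n\nAssistant:"

resp = requests.post(
    "http://localhost:30000/generate",  # default SGLang port; adjust if you set --port
    json={
        "text": prompt,
        "sampling_params": {"temperature": 0.7, "max_new_tokens": 64},
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["text"])  # the reply should mention "Grok"
```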
## Community Usage (Examples)
- Local-only serving behind VPN/Nginx allowlist
- Log and audit inference (timestamps and SHA-256 manifests; see the sketch after this list)
- Optional cloud fallback to xAI’s API when local capacity is unavailable
These are usage patterns only; they don’t alter license or weights.
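The logging/audit bullet above is a deployment pattern rather than anything shipped with the weights. One possible shape for it is the hypothetical helper below, which appends a JSON-lines record carrying a timestamp and SHA-256 digests of each prompt/completion pair; the file path and record layout are assumptions.

```python
# Sketch: append-only audit log with timestamps and SHA-256 manifests (illustrative only).
import hashlib
import json
import time

def log_inference(prompt: str, completion: str,
                  path: str = "/var/log/grok2_audit.jsonl") -> None:
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "completion_sha256": hashlib.sha256(completion.encode("utf-8")).hexdigest(),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```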
## Limitations and Safety
- Large memory footprint (multi-GPU recommended)
- Follow the Grok 2 Community License
- Redact any sensitive data before inference if routing via cloud services
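For the redaction bullet above, a deliberately naive sketch (the patterns and placeholder tokens are assumptions and nowhere near exhaustive; real deployments need broader coverage):

```python
# Sketch: naive regex redaction of emails and US-style phone numbers
# before sending a prompt to a remote endpoint. Illustrative only.
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

print(redact("Reach me at alice@example.com or 555-123-4567."))
# -> "Reach me at [EMAIL] or [PHONE]."
```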
## License
Weights are licensed under the Grok 2 Community License Agreement (./LICENSE).