---
language:
- en
license: other
library_name: sglang
pipeline_tag: text-generation
tags:
- grok-2
- xai
- sglang
- inference
- triton
base_model: xai-org/grok-2
model-index:
- name: grok-2
  results: []
---
# Grok 2
This repository contains the weights of Grok 2, a model trained and used at xAI in 2024.
- License: Grok 2 Community License Agreement (./LICENSE)
- Ownership: xAI (no changes to license or weights in this PR)
## Weights
- Download from the Hub (≈500 GB total; 42 files):

  ```bash
  hf download xai-org/grok-2 --local-dir /local/grok-2
  ```

  If you see transient errors, retry until it completes (a retry sketch follows below).
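If the CLI keeps hitting transient errors, a small retry loop around `huggingface_hub.snapshot_download` can be used instead; downloads are resumable, so already-fetched files are skipped. This is a minimal sketch, assuming `huggingface_hub` is installed and reusing the same repo id and target directory as the command above.

```python
# Minimal sketch: resumable download with basic retries (not an official procedure).
import time
from huggingface_hub import snapshot_download

for attempt in range(1, 11):
    try:
        snapshot_download(repo_id="xai-org/grok-2", local_dir="/local/grok-2")
        break  # all 42 files present
    except Exception as err:  # transient network/HTTP errors
        print(f"attempt {attempt} failed: {err}; retrying in 30 s")
        time.sleep(30)
```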
## Hardware and Parallelism
- This checkpoint is configured for TP=8.
- Recommended: 8× GPUs (each > 40 GB memory).
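As an optional pre-flight check, the sketch below uses PyTorch (an assumption; any CUDA-aware tool works) to confirm that at least 8 GPUs are visible and that each exceeds the recommended 40 GB before you launch the server.

```python
# Sketch: sanity-check the TP=8 hardware recommendation.
import torch

count = torch.cuda.device_count()
print(f"visible GPUs: {count}")
assert count >= 8, "TP=8 needs at least 8 GPUs"

for i in range(count):
    props = torch.cuda.get_device_properties(i)
    mem_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {mem_gb:.0f} GB")
    assert mem_gb > 40, f"GPU {i} has only {mem_gb:.0f} GB; > 40 GB is recommended"
```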
## Serving with SGLang (>= v0.5.1)
Install SGLang from https://github.com/sgl-project/sglang/
Launch an inference server:
```bash
python3 -m sglang.launch_server \
  --model /local/grok-2 \
  --tokenizer-path /local/grok-2/tokenizer.tok.json \
  --tp 8 \
  --quantization fp8 \
  --attention-backend triton
```
Send a test request (chat template aware):
```bash
python3 -m sglang.test.send_one --prompt \
  "Human: What is your name?<|separator|>\n\nAssistant:"
```
You should see the model respond with its name: “Grok”.
More ways to send requests:
- https://docs.sglang.ai/basic_usage/send_request.html
- Note: this is a post-trained model, so use the correct chat template (a plain-HTTP request sketch follows below):
  https://github.com/sgl-project/sglang/blob/97a38.../tiktoken_tokenizer.py#L106
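Beyond `send_one`, you can also post directly to the server's native `/generate` endpoint, as covered in the "send request" docs linked above. The sketch below assumes the launch command above is running on SGLang's default port 30000 and hand-writes the post-training chat format (the `Human:`/`Assistant:` prefixes and `<|separator|>` token from the test prompt); the sampling parameters are illustrative.

```python
# Sketch: query the running SGLang server over HTTP with the Grok 2 chat format.
import requests

prompt = "Human: What is your name?<|separator|>\n\nAssistant:"

resp = requests.post(
    "http://localhost:30000/generate",  # default SGLang port; adjust if you set --port
    json={
        "text": prompt,
        "sampling_params": {"temperature": 0.7, "max_new_tokens": 64},
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["text"])  # the reply should mention "Grok"
```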
## Community Usage (Examples)
- Local-only serving behind VPN/Nginx allowlist
- Log and audit inference (timestamps and SHA-256 manifests; see the sketch after this list)
- Optional cloud fallback to xAI’s API when local capacity is unavailable
These are usage patterns only; they don’t alter license or weights.
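The logging/audit bullet above is a deployment pattern rather than anything shipped with the weights. One possible shape for it is the hypothetical helper below, which appends a JSON-lines record carrying a timestamp and SHA-256 digests of each prompt/completion pair; the file path and record layout are assumptions.

```python
# Sketch: append-only audit log with timestamps and SHA-256 manifests (illustrative only).
import hashlib
import json
import time

def log_inference(prompt: str, completion: str,
                  path: str = "/var/log/grok2_audit.jsonl") -> None:
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "completion_sha256": hashlib.sha256(completion.encode("utf-8")).hexdigest(),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```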
## Limitations and Safety
- Large memory footprint (multi-GPU recommended)
- Follow the Grok 2 Community License
- Redact any sensitive data before inference if routing via cloud services
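For the redaction bullet above, a deliberately naive sketch (the patterns and placeholder tokens are assumptions and nowhere near exhaustive; real deployments need broader coverage):

```python
# Sketch: naive regex redaction of emails and US-style phone numbers
# before sending a prompt to a remote endpoint. Illustrative only.
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

print(redact("Reach me at alice@example.com or 555-123-4567."))
# -> "Reach me at [EMAIL] or [PHONE]."
```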
## License
Weights are licensed under the Grok 2 Community License Agreement (./LICENSE).