	---
	language:
	- en
	license: other
	library_name: sglang
	pipeline_tag: text-generation
	tags:
	- grok-2
	- xai
	- sglang
	- inference
	- triton
	base_model: xai-org/grok-2
	model-index:
	- name: grok-2
	  results: []
	---

	# Grok 2

	This repository contains the weights of Grok 2, a model trained and used at xAI in 2024.

	- License: Grok 2 Community License Agreement (./LICENSE)
	- Ownership: xAI (this document does not change license or weights)

	## Weights

	Download from the Hub (≈500 GB total; 42 files):

	hf download xai-org/grok-2 --local-dir /local/grok-2

	If you see transient errors, rerun the command until it completes. On success, the target directory should contain 42 files (~500 GB).
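
The retry advice above can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` Python package is installed and using its `snapshot_download` function. The helper names (`download_with_retry`, `count_files`, `download_grok2`), the attempt count, and the delay are hypothetical choices for illustration, not part of the official instructions.

```python
import time
from pathlib import Path


def download_with_retry(download, attempts=5, delay_s=10.0):
    """Call `download()` until it succeeds or `attempts` is exhausted."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return download()
        except Exception as exc:  # transient network errors surface here
            last_error = exc
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"download failed after {attempts} attempts") from last_error


def count_files(local_dir):
    """Count regular files under `local_dir`.

    Assumption: a hidden `.cache` folder, which some huggingface_hub
    versions create for download metadata, should not be counted.
    """
    root = Path(local_dir)
    return sum(
        1
        for p in root.rglob("*")
        if p.is_file() and ".cache" not in p.relative_to(root).parts
    )


def download_grok2(local_dir="/local/grok-2"):
    """Download the repository with retries and report the file count."""
    # Imported lazily so the helpers above work without the package.
    from huggingface_hub import snapshot_download

    download_with_retry(
        lambda: snapshot_download(repo_id="xai-org/grok-2", local_dir=local_dir)
    )
    print(f"{count_files(local_dir)} files in {local_dir} (this card lists 42)")
```

Calling `download_grok2()` performs the download. Already completed files are normally skipped on a rerun, so retrying is cheap.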

	## Hardware and Parallelism

	- This checkpoint is configured for TP=8.
	- Recommended: 8× GPUs (each > 40 GB memory).
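
Before launching, it can help to confirm that the machine matches this recommendation. The sketch below assumes NVIDIA GPUs with `nvidia-smi` on the PATH, and it reads "> 40 GB" as "more than 40 × 1024 MiB", which is an interpretation rather than something this card states. The function names are hypothetical.

```python
import subprocess

MIN_GPUS = 8                # the checkpoint is configured for TP=8
MIN_MEMORY_MIB = 40 * 1024  # assumption: "> 40 GB" read as > 40 GiB


def parse_gpu_memory(csv_text):
    """Parse one total-memory value (MiB) per line of nvidia-smi CSV output."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def meets_recommendation(memory_mib, min_gpus=MIN_GPUS, min_mib=MIN_MEMORY_MIB):
    """True if at least `min_gpus` GPUs each have more than `min_mib` MiB."""
    big_enough = [m for m in memory_mib if m > min_mib]
    return len(big_enough) >= min_gpus


def query_gpu_memory():
    """Ask nvidia-smi for each GPU's total memory in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_memory(result.stdout)
```

On a GPU host, `meets_recommendation(query_gpu_memory())` returns `True` only when eight or more GPUs each exceed the threshold.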

	## Serving with SGLang (>= v0.5.1)

	Install SGLang from https://github.com/sgl-project/sglang/

	Launch an inference server:

	python3 -m sglang.launch_server \
	  --model /local/grok-2 \
	  --tokenizer-path /local/grok-2/tokenizer.tok.json \
	  --tp 8 \
	  --quantization fp8 \
	  --attention-backend triton

	Send a test request (chat template aware):

	python3 -m sglang.test.send_one --prompt \
	  "Human: What is your name?<|separator|>\n\nAssistant:"

	You should see the model respond with its name: “Grok”.
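
The same test request can be sent from Python. This is a sketch under stated assumptions: the server launched above listens on SGLang's default address `http://localhost:30000`, the native `/generate` endpoint is used, and the `\n\n` in the test prompt stands for two real newlines. The helper names and sampling values are hypothetical, and only the single-turn layout shown in the test prompt is covered.

```python
import json
import urllib.request

SEPARATOR = "<|separator|>"


def build_prompt(user_message):
    """Format a single-turn prompt in the layout used by the test request."""
    return f"Human: {user_message}{SEPARATOR}\n\nAssistant:"


def build_payload(user_message, max_new_tokens=64):
    """Request body for SGLang's native /generate endpoint."""
    return {
        "text": build_prompt(user_message),
        "sampling_params": {"max_new_tokens": max_new_tokens, "temperature": 0},
    }


def generate(user_message, base_url="http://localhost:30000"):
    """POST the prompt to a running server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["text"]
```

With the server running, `generate("What is your name?")` should return a reply that mentions "Grok".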

	More ways to send requests:

	- https://docs.sglang.ai/basic_usage/send_request.html

	Note: this is a post-trained model; use the correct chat template:

	- https://github.com/sgl-project/sglang/blob/97a38ee85ba62e268bde6388f1bf8edfe2ca9d76/python/sglang/srt/tokenizer/tiktoken_tokenizer.py#L106

	## Community Usage (Examples)

	- Local-only serving behind VPN/Nginx allowlist
	- Log and audit inference (timestamps and SHA‑256 manifests)
	- Optional fallback to xAI’s API when local capacity is unavailable

	These examples describe usage patterns only; they do not alter license or weights.
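
As an illustration of the logging and audit pattern, the sketch below builds a SHA-256 manifest of a directory and a timestamped audit record for one request. The record layout and all function names are hypothetical, and it hashes prompts and responses instead of storing them, which is one possible design and not a requirement of this card.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight shards need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its SHA-256 digest."""
    root = Path(root)
    files = {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    return {"created_utc": datetime.now(timezone.utc).isoformat(), "files": files}


def write_manifest(root, out_path):
    """Write the manifest as JSON to `out_path` (keep it outside `root`)."""
    Path(out_path).write_text(json.dumps(build_manifest(root), indent=2))


def audit_record(prompt, response):
    """One log entry: a UTC timestamp plus hashes of the prompt and response."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
    }
```

Hashing all of `/local/grok-2` reads roughly 500 GB from disk, so a manifest is best built once after download and compared against later runs.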

	## Limitations and Safety

	- Large memory footprint (multi-GPU recommended)
	- Follow the Grok 2 Community License
	- Redact any sensitive data before inference if routing via cloud services
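
Redaction can be as simple as a substitution pass over the prompt before it leaves the local network. The sketch below covers email addresses only; the pattern, the placeholder, and the function name are hypothetical, and real deployments would likely need broader rules for other kinds of sensitive data.

```python
import re

# Deliberately simple email pattern; an illustration, not a full validator.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text, placeholder="[REDACTED_EMAIL]"):
    """Replace every email-like substring in `text` with `placeholder`."""
    return EMAIL.sub(placeholder, text)
```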

	## License

	Weights are licensed under the Grok 2 Community License Agreement (./LICENSE).

	Suggested PR comment (short and neutral)

	- Summary: Fix model card metadata (YAML at top), remove duplicated sections, fence code blocks, and keep license/ownership unchanged.
	- Scope: README.md only. No weights or license changes.
	- Rationale: Resolves Hub YAML warning and makes SGLang instructions copy‑paste runnable.
	- Notes: URLs unbroken; model-index.results properly nested.