sh2orc committed
Commit 2c48ccf (verified) · 1 Parent(s): 7e85a54

Update README.md

Files changed (1): README.md (+4 -4)
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
   - FP8
   ---
 
- # Qwen3-32B-FP8-dynamic
+ # Qwen3-32B-FP8-Dynamic
 
   ## Model Overview
   - **Model Architecture:** Qwen3ForCausalLM
@@ -30,7 +30,7 @@ tags:
   - **Out-of-scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws).
   - **Release Date:** 05/02/2025
   - **Version:** 1.0
- - **Model Developers:** RedHat (Neural Magic)
+ - **Model Developers:** BC Card, Redhat
 
   ### Model Optimizations
 
@@ -51,7 +51,7 @@ This model can be deployed efficiently using the [vLLM](https://docs.vllm.ai/en/
   from vllm import LLM, SamplingParams
   from transformers import AutoTokenizer
 
- model_id = "RedHatAI/Qwen3-32B-FP8-dynamic"
+ model_id = "BCCard/Qwen3-32B-FP8-dynamic"
   number_gpus = 1
   sampling_params = SamplingParams(temperature=0.6, top_p=0.95, top_k=20, min_p=0, max_tokens=256)
 
@@ -128,7 +128,7 @@ The model was evaluated on the OpenLLM leaderboard tasks (version 1), using [lm-
   ```
   lm_eval \
     --model vllm \
-   --model_args pretrained="RedHatAI/Qwen3-32B-FP8-dynamic",dtype=auto,gpu_memory_utilization=0.5,max_model_len=8192,enable_chunk_prefill=True,tensor_parallel_size=1 \
+   --model_args pretrained="BCCard/Qwen3-32B-FP8-dynamic",dtype=auto,gpu_memory_utilization=0.5,max_model_len=8192,enable_chunk_prefill=True,tensor_parallel_size=1 \
     --tasks openllm \
     --apply_chat_template\
     --fewshot_as_multiturn \
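
For reference, a minimal end-to-end sketch of the deployment snippet touched by the vLLM hunk, assuming the surrounding README code (chat-template construction and the `llm.generate` call) is unchanged by this commit; the example prompt and any variable names beyond those shown in the diff are illustrative, not taken from the repository.

```
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Repository name as updated by this commit
model_id = "BCCard/Qwen3-32B-FP8-dynamic"
number_gpus = 1

# Sampling settings from the README snippet
sampling_params = SamplingParams(temperature=0.6, top_p=0.95, top_k=20, min_p=0, max_tokens=256)

# Build a chat-formatted prompt (the message text is illustrative)
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the FP8-dynamic checkpoint with vLLM and generate
llm = LLM(model=model_id, tensor_parallel_size=number_gpus)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```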
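The evaluation hunk likewise only swaps the `pretrained=` repository in the `lm_eval` command. A rough Python equivalent is sketched below, assuming a recent lm-evaluation-harness that exposes `simple_evaluate` with `apply_chat_template` and `fewshot_as_multiturn` keyword arguments; note also that vLLM's option is spelled `enable_chunked_prefill`, so the `enable_chunk_prefill=True` key carried over in the command above may need adjusting.

```
import lm_eval

# Model arguments mirroring the updated --model_args string
# (spelled enable_chunked_prefill here; the README command writes enable_chunk_prefill)
model_args = (
    "pretrained=BCCard/Qwen3-32B-FP8-dynamic,"
    "dtype=auto,gpu_memory_utilization=0.5,max_model_len=8192,"
    "enable_chunked_prefill=True,tensor_parallel_size=1"
)

# Run the OpenLLM leaderboard v1 task group through the vLLM backend
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args=model_args,
    tasks=["openllm"],
    apply_chat_template=True,
    fewshot_as_multiturn=True,
    batch_size="auto",
)
print(results["results"])
```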