Update README.md
README.md CHANGED
@@ -1,8 +1,23 @@
+---
+license: apache-2.0
+language:
+- es
+- en
+base_model:
+- Qwen/Qwen3-Reranker-4B
+pipeline_tag: text-ranking
+---
 
 # prudant/Qwen3-Reranker-4B-seq-cls-vllm-fixed-W4A16_ASYM
 
 This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed using llm-compressor with the following scheme: W4A16_ASYM
 
+## Serving
+
+``python3 -m vllm.entrypoints.openai.api_server --model 'dolfsai/Qwen3-Reranker-4B-seq-cls-vllm-W4A16_ASYM' --task classify``
+
+**Important**: You MUST read the following [guide](https://github.com/vllm-project/vllm/pull/19260) for correct usage of this model.
+
 ## Model Details
 
 - **Original Model**: danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed
@@ -10,4 +25,4 @@ This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixe
 - **Compression Libraries**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
 - **Calibration Dataset**: ultrachat_200k (512 samples)
 - **Optimized For**: Inference with vLLM
-- **License**: same as original model
+- **License**: same as original model
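
The added Serving section gives only the launch command. As a rough illustration of how a client could call the resulting server, the sketch below posts a query and a list of documents to vLLM's Score API. The host/port, the `/score` route, and the omission of the Qwen3 reranker prompt template are assumptions, not taken from this model card (some vLLM versions expose scoring only when launched with `--task score`); the linked guide (vLLM PR #19260) remains the authoritative reference for correct usage.

```python
# Minimal client sketch for the server launched with the command above.
# Assumptions (not from the model card): host/port, the /score route, and
# that the endpoint does not require the Qwen3 reranker prompt template.
import requests

BASE_URL = "http://localhost:8000"  # default vLLM server address (assumed)
MODEL = "dolfsai/Qwen3-Reranker-4B-seq-cls-vllm-W4A16_ASYM"

query = "What is the capital of France?"
documents = [
    "Paris is the capital and largest city of France.",
    "The Great Wall of China is visible from low Earth orbit.",
]

# Score API: text_1 is scored against each entry of text_2.
resp = requests.post(
    f"{BASE_URL}/score",
    json={"model": MODEL, "text_1": query, "text_2": documents},
    timeout=60,
)
resp.raise_for_status()

# One score per (query, document) pair; higher means more relevant.
for doc, item in zip(documents, resp.json()["data"]):
    print(f"{item['score']:.4f}  {doc}")
```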
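The Model Details entries (llm-compressor, the W4A16_ASYM scheme, 512 ultrachat_200k calibration samples) describe a one-shot quantization run. The sketch below is patterned on llm-compressor's published W4A16 examples and is not the exact script used to produce this checkpoint; the model class, GPTQ modifier, maximum sequence length, ignore list, and dataset preprocessing are all assumptions.

```python
# Sketch of a W4A16_ASYM one-shot quantization with llm-compressor,
# following the library's published W4A16 examples -- NOT the exact script
# used for this checkpoint. Model class, sequence length, ignore list, and
# preprocessing are assumptions.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from llmcompressor import oneshot  # older releases: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed"
NUM_CALIBRATION_SAMPLES = 512   # matches the model card
MAX_SEQUENCE_LENGTH = 2048      # assumption
SAVE_DIR = "Qwen3-Reranker-4B-seq-cls-vllm-fixed-W4A16_ASYM"

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# 512 calibration samples from ultrachat_200k, rendered through the chat
# template and tokenized (preprocessing details assumed).
ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
ds = ds.shuffle(seed=42).select(range(NUM_CALIBRATION_SAMPLES))
ds = ds.map(
    lambda ex: tokenizer(
        tokenizer.apply_chat_template(ex["messages"], tokenize=False),
        max_length=MAX_SEQUENCE_LENGTH,
        truncation=True,
        add_special_tokens=False,
    ),
    remove_columns=ds.column_names,
)

# 4-bit weights, 16-bit activations, asymmetric quantization of Linear layers;
# leaving the classification head unquantized is an assumption.
recipe = GPTQModifier(targets="Linear", scheme="W4A16_ASYM", ignore=["score"])

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```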