prudant commited on
Commit
e11ec3b
·
verified ·
1 Parent(s): 0d31b8a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +16 -1
README.md CHANGED
@@ -1,8 +1,23 @@
 
 
 
 
 
 
 
 
 
1
 
2
  # prudant/Qwen3-Reranker-4B-seq-cls-vllm-fixed-W4A16_ASYM
3
 
4
  This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed using llm-compressor with the following scheme: W4A16_ASYM
5
 
 
 
 
 
 
 
6
  ## Model Details
7
 
8
  - **Original Model**: danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed
@@ -10,4 +25,4 @@ This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixe
10
  - **Compression Libraries**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
11
  - **Calibration Dataset**: ultrachat_200k (512 samples)
12
  - **Optimized For**: Inference with vLLM
13
- - **License**: same as original model
 
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - es
5
+ - en
6
+ base_model:
7
+ - Qwen/Qwen3-Reranker-4B
8
+ pipeline_tag: text-ranking
9
+ ---
10
 
11
  # prudant/Qwen3-Reranker-4B-seq-cls-vllm-fixed-W4A16_ASYM
12
 
13
  This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed using llm-compressor with the following scheme: W4A16_ASYM
14
 
15
+ ## Serving
16
+
17
+ ``python3 -m vllm.entrypoints.openai.api_server --model 'dolfsai/Qwen3-Reranker-4B-seq-cls-vllm-W4A16_ASYM' --task classify``
18
+
19
+ **Important**: You MUST read the following guide for correct usage of this model here [Guide](https://github.com/vllm-project/vllm/pull/19260)
20
+
21
  ## Model Details
22
 
23
  - **Original Model**: danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed
 
25
  - **Compression Libraries**: [llm-compressor](https://github.com/vllm-project/llm-compressor)
26
  - **Calibration Dataset**: ultrachat_200k (512 samples)
27
  - **Optimized For**: Inference with vLLM
28
+ - **License**: same as original model