prudant's picture
Update README.md
e11ec3b verified
metadata
license: apache-2.0
language:
  - es
  - en
base_model:
  - Qwen/Qwen3-Reranker-4B
pipeline_tag: text-ranking

prudant/Qwen3-Reranker-4B-seq-cls-vllm-fixed-W4A16_ASYM

This is a compressed version of danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed using llm-compressor with the following scheme: W4A16_ASYM

Serving

python3 -m vllm.entrypoints.openai.api_server --model 'dolfsai/Qwen3-Reranker-4B-seq-cls-vllm-W4A16_ASYM' --task classify

Important: You MUST read the following guide for correct usage of this model here Guide

Model Details

  • Original Model: danielchalef/Qwen3-Reranker-4B-seq-cls-vllm-fixed
  • Quantization Method: AWQ
  • Compression Libraries: llm-compressor
  • Calibration Dataset: ultrachat_200k (512 samples)
  • Optimized For: Inference with vLLM
  • License: same as original model