---
base_model: Qwen/Qwen2.5-32B-Instruct
language:
- en
license: apache-2.0
pipeline_tag: text-generation
library_name: furiosa-llm
tags:
- furiosa-ai
- qwen
- qwen-2.5
---

# Model Overview

- **Model Architecture:** Qwen2
  - **Input:** Text
  - **Output:** Text
- **Model Optimizations:**
- **Context Length:** 32k tokens
  - Maximum Prompt Length: 32768 tokens
  - Maximum Generation Length: 32768 tokens
- **Intended Use Cases:** Intended for commercial and non-commercial use. Like [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct), this model is intended for assistant-like chat.
- **Release Date:** 08/21/2025
- **Version:** v2025.3
- **License(s):** [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE)
- **Supported Inference Engine(s):** Furiosa LLM
- **Supported Hardware Compatibility:** FuriosaAI RNGD
- **Preferred Operating System(s):** Linux

## Description

This model is the pre-compiled version of [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct), an auto-regressive language model that uses an optimized transformer architecture.

## Usage

To run this model with [Furiosa-LLM](https://developer.furiosa.ai/latest/en/furiosa_llm/intro.html), follow the example command below after [installing Furiosa-LLM and its prerequisites](https://developer.furiosa.ai/latest/en/getting_started/furiosa_llm.html#installing-furiosa-llm).

```sh
furiosa-llm serve furiosa-ai/Qwen2.5-32B-Instruct
```
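Once the server is up, it can typically be queried over an OpenAI-compatible chat completions endpoint, as is the convention for LLM serving engines. The port (`8000`) and endpoint path below are assumptions — verify them against the Furiosa-LLM serving documentation for your installed version:

```sh
# Assumes the server started by `furiosa-llm serve` listens on localhost:8000
# and exposes the OpenAI-compatible /v1/chat/completions route.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "furiosa-ai/Qwen2.5-32B-Instruct",
    "messages": [
      {"role": "user", "content": "Give me a short introduction to large language models."}
    ]
  }'
```

The response is a JSON object whose `choices[0].message.content` field holds the generated text.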