---
base_model: Qwen/Qwen2.5-32B-Instruct
language:
- en
license: apache-2.0
pipeline_tag: text-generation
library_name: furiosa-llm
tags:
- furiosa-ai
- qwen
- qwen-2.5
---

# Model Overview

- **Model Architecture:** Qwen2
  - **Input:** Text
  - **Output:** Text
- **Model Optimizations:**
- **Context Length:** 32k tokens
  - Maximum Prompt Length: 32768 tokens
  - Maximum Generation Length: 32768 tokens
- **Intended Use Cases:** Intended for commercial and non-commercial use. Like [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct), this model is intended for assistant-like chat.
- **Release Date:** 08/21/2025
- **Version:** v2025.3
- **License(s):** [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE)
- **Supported Inference Engine(s):** Furiosa LLM
- **Supported Hardware Compatibility:** FuriosaAI RNGD
- **Preferred Operating System(s):** Linux

## Description

This model is the pre-compiled version of [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct), an auto-regressive language model that uses an optimized transformer architecture.

## Usage

To run this model with [Furiosa-LLM](https://developer.furiosa.ai/latest/en/furiosa_llm/intro.html), follow the example command below after [installing Furiosa-LLM and its prerequisites](https://developer.furiosa.ai/latest/en/getting_started/furiosa_llm.html#installing-furiosa-llm).

```sh
furiosa-llm serve furiosa-ai/Qwen2.5-32B-Instruct
```
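Once the server is up, it can typically be queried over an OpenAI-compatible chat completions endpoint, as is the convention for LLM serving engines. The port (`8000`) and endpoint path below are assumptions — verify them against the Furiosa-LLM serving documentation for your installed version:

```sh
# Assumes the server started by `furiosa-llm serve` listens on localhost:8000
# and exposes the OpenAI-compatible /v1/chat/completions route.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "furiosa-ai/Qwen2.5-32B-Instruct",
    "messages": [
      {"role": "user", "content": "Give me a short introduction to large language models."}
    ]
  }'
```

The response is a JSON object whose `choices[0].message.content` field holds the generated text.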