Is it mandatory to install flash-attention for GPT-OSS?

#112 opened by xiaotianyu2025

Is it mandatory to install flash-attention for GPT-OSS? Attempting to install it makes my 3090 server freeze for about an hour and become unreachable, and even after the install finally finishes, pip list doesn't show flash-attention.

Running:

```
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
git checkout v3
pip install --upgrade pip setuptools wheel
pip install . --use-pep517
```

and

```
MAKEFLAGS="-j2" pip install . --use-pep517
```

always causes the server to freeze.
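If the freeze is just the compile step using every core and all the RAM, would capping the parallel build jobs help? Something like this sketch (assuming flash-attention's setup.py still reads MAX_JOBS, as its README describes, and that MAKEFLAGS is simply ignored):

```
# sketch: cap the parallel compile jobs so the build doesn't exhaust RAM and hang the machine
# (MAX_JOBS is the variable documented in flash-attention's README; MAKEFLAGS may not be honored)
MAX_JOBS=2 pip install . --use-pep517
```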

Environment: vllm==0.10.1+gptoss, transformers==4.55.0, Python 3.12, CUDA 12.0
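If flash-attention is optional, this is roughly what I'd like to be able to run on the 3090 without building it first (just a sketch; openai/gpt-oss-20b is the checkpoint I'm assuming, and I don't know which attention backend vLLM would fall back to):

```
# sketch: serve GPT-OSS with vLLM without installing flash-attention,
# letting vLLM pick whatever attention backend it supports on this GPU
vllm serve openai/gpt-oss-20b --port 8000
```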
