Fine-tuning 120b on 8 H100s getting CUDA OOM error

#117
by jinxu88 - opened

I am using the script from this repo (https://github.com/huggingface/gpt-oss-recipes/blob/main/README.md) for fine-tuning gpt-oss-120b. The OpenAI blog (https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers) mentions it is doable on a single H100, but I keep getting OOM. Has anyone successfully fine-tuned it on H100?

The example you linked seems to SFT the 20b model, not the 120b...

I am interested in hearing whether anyone has managed a successful fine-tune run of the 120b on H100, as the model card quoted below suggests is possible. The GitHub link was only provided as a reference; it did not work for the 120b model on H100.

Model card mentioned:

Fine-tuning
Both gpt-oss models can be fine-tuned for a variety of specialized use cases.
This larger model gpt-oss-120b can be fine-tuned on a single H100 node, whereas the smaller gpt-oss-20b can even be fine-tuned on consumer hardware.

Pretty sure it meant QLoRA with DeepSpeed ZeRO-3 and other memory optimizations, rather than full fine-tuning; a rough sketch of that setup is below.
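For what it's worth, here is a minimal sketch of what such a QLoRA + ZeRO-3 run could look like. This is not the recipes-repo script: the dataset, LoRA target modules, and hyperparameters are illustrative assumptions, and it presumes recent transformers/peft/trl/accelerate versions that support 4-bit bitsandbytes quantization together with DeepSpeed ZeRO-3.

```python
# sft.py -- hypothetical QLoRA-style SFT sketch for gpt-oss-120b.
# Intended to be launched under DeepSpeed ZeRO-3, e.g.:
#   accelerate launch --config_file zero3.yaml sft.py
# where zero3.yaml is an accelerate config with ZeRO stage 3 enabled.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

model_id = "openai/gpt-oss-120b"

# 4-bit NF4 quantization keeps the frozen base weights small enough to shard
# across the node. Assumption: your stack supports bnb 4-bit with ZeRO-3.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# LoRA adapter on the attention projections only; the target module names are
# an assumption and may need adjusting for the gpt-oss architecture.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Any chat-formatted SFT dataset works here; this one is just an example.
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

training_args = SFTConfig(
    output_dir="gpt-oss-120b-sft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,  # trades extra compute for a big activation-memory saving
    bf16=True,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```

The key memory levers are the 4-bit base weights, LoRA instead of full-parameter updates, gradient checkpointing, a per-device batch size of 1 with gradient accumulation, and ZeRO-3 sharding the remaining states across the 8 GPUs. If you still OOM, shortening the maximum sequence length is usually the next thing to try.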
