Deploy gpt-oss models in your own AWS account using vLLM and Tensorfuse


Hi all,

We've released a guide for deploying OpenAI's latest open-weight gpt-oss models in your own AWS account. What's included:

- An optimized Dockerfile built on the latest `vllm-openai:gptoss` image, covering both the 20B and 120B models (a minimal sketch follows this list)
- Throughput of 240 tokens/sec on 1×H100 with the 20B model and 200 tokens/sec on 8×H100 with the 120B model
- Serving at the full 130k-token context length
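For reference, here is a minimal sketch of such an image. The `vllm/vllm-openai:gptoss` tag comes from the post, and the 131,072-token window (the "130k" figure above) is an assumption about the model's full context; the optimized Dockerfile in the guide may set further flags.

```dockerfile
# Minimal sketch; the guide's optimized Dockerfile may differ.
FROM vllm/vllm-openai:gptoss

# The image's entrypoint is vLLM's OpenAI-compatible API server, so model
# arguments go in CMD. For the 120B model, swap in openai/gpt-oss-120b and
# add "--tensor-parallel-size", "8" to shard across 8 GPUs.
CMD ["--model", "openai/gpt-oss-20b", \
     "--max-model-len", "131072", \
     "--port", "8000"]
```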
Follow the guide to run it in your AWS account: https://tensorfuse.io/docs/guides/modality/text/openai_oss
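Once the deployment is up, it exposes the standard OpenAI-compatible API, so any OpenAI client works against it. A quick smoke test with curl (the endpoint URL and API key below are placeholders; substitute the values from your Tensorfuse deployment):

```bash
# Placeholder endpoint and key; use your deployment's actual values.
curl https://your-deployment.example.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{
        "model": "openai/gpt-oss-20b",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64
      }'
```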

Get started with Tensorfuse here: https://app.tensorfuse.io/

It would also be awesome to see metrics on consumer hardware.
