OOM on 3090

#60
by TheBigBlockPC - opened

I tried running this LLM on my dual-3090 PC, but it runs out of memory on a single 3090 and even across both. Quantizing with bitsandbytes doesn't work either. I'm using transformers. Can someone help me fix this?
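For context, the kind of bitsandbytes 4-bit load that was failing presumably looked something like this (a minimal sketch; the model id is a placeholder, not the actual checkpoint from this thread):

```python
# Sketch of a typical transformers + bitsandbytes 4-bit load.
# "some-org/some-model" is a placeholder, not the real repo id.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # quantize weights to 4-bit on load
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-model",          # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",              # shard layers across both 3090s
)
```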

Have you tried MXFP4 precision? It's suggested in their cookbook; that way it should fit in 16 GB.
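If the checkpoint ships MXFP4 weights, the load might look like the sketch below. The model id is an assumption (the thread doesn't name it), and note that keeping the native MXFP4 precision generally depends on compatible Triton kernels being installed; if they're missing, transformers may upcast the weights, which would reproduce the OOM.

```python
# Minimal sketch of loading a checkpoint with native MXFP4 weights.
# The model id is an assumption; substitute the actual repo from this thread.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumption, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # "auto" keeps the checkpoint's stored precision
    device_map="auto",   # place layers across the available GPUs
)
```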

Fixed it. Here is what I found out: link

TheBigBlockPC changed discussion status to closed
