Update README.md
#16 opened 3 days ago by shubham001213
Does DeepSeek-Llama-70B support tensor parallelism for multi-GPU inference?
1
#14 opened 10 days ago by Merk0701234
Weight file naming does not follow a regular rule
#13 opened 18 days ago by haili-tian
How much VRAM do you need?
8
#12 opened 21 days ago by hyun10
Upload IMG_4815.jpeg
#11 opened 24 days ago by H3mzy11

Amazon SageMaker deployment failing with CUDA OutOfMemory error
3
#10 opened 27 days ago by neelkapadia
Is <thinking> the proper tag?
4
#8 opened 27 days ago by McUH
Add pipeline tag
#7 opened about 1 month ago by nielsr

SFT (non-RL) distillation is this good on a sub-100B model?
3
#2 opened about 1 month ago by KrishnaKaasyap