Any plans to release 120B and 20-30B-level models?

#5 opened by Sunny2038

It would be great if LongCat could have 120B and 20-30B-level models, like gpt-oss-120b or GLM-4.5-Air, and/or gpt-oss-20b, seed-36b, or qwen3-30b.

Yeah... like a 100B A3.3-5.5B and a 32B A1-2B. That would be great.

Also, any dense versions, please.

Why dense?

I'm all for a Mixture of Experts model of roughly 30B total parameters with about 3-6B active. That seems like a good trade-off between speed and performance, and it's especially useful for bringing better-quality models to regular PC users who can't afford more powerful hardware. Recent advancements at this size suggest these models have great potential that's yet to be unleashed.
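For a rough sense of that trade-off, here's a back-of-the-envelope sketch (plain Python; the configs are made-up illustrative numbers, not any specific model): at the same 30B total, an A3B MoE needs about the same RAM for weights, but each token only touches the active slice, which is what drives generation speed on modest hardware.

```python
# Back-of-the-envelope memory/compute sketch for a hypothetical
# 30B-total / 3B-active MoE vs. a 30B dense model.
# Illustrative numbers only -- not any specific model's config.

def weight_gib(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB for params_b billion parameters."""
    return params_b * 1e9 * bits_per_weight / 8 / 2**30

for name, total_b, active_b in [("30B dense", 30, 30),
                                ("30B-A3B MoE", 30, 3)]:
    mem = weight_gib(total_b, 4.5)      # ~4-bit quant plus some overhead
    flops = 2 * active_b * 1e9          # decode FLOPs/token ~ 2 * active params
    print(f"{name}: ~{mem:.1f} GiB weights, ~{flops / 1e9:.0f} GFLOPs/token")
```

Both fit in ~16 GiB at 4-bit, but the MoE does roughly a tenth of the matmul work per generated token.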

Ohh yeah, I got it, thanks.

TBH I'd rather have them try something more experimental, like a BitNet (or at least partially BitNet) model for cheaper deployment, or an alternate attention scheme for longer context.
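For reference, the core of the BitNet b1.58 recipe is tiny; here's a minimal sketch of its "absmean" ternary weight quantizer (NumPy, forward pass only -- the real thing trains with this in the loop as QAT):

```python
import numpy as np

def absmean_ternary(w: np.ndarray, eps: float = 1e-5):
    """BitNet b1.58-style quantization: weights -> {-1, 0, +1} * scale.

    The scale is the mean absolute value of the tensor ("absmean").
    This shows only the quantizer, not the training loop around it.
    """
    scale = np.abs(w).mean() + eps
    w_q = np.clip(np.round(w / scale), -1, 1)
    return w_q.astype(np.int8), scale

w = np.random.randn(4, 8).astype(np.float32)
w_q, scale = absmean_ternary(w)
print(w_q)                              # ternary matrix
print(np.abs(w - w_q * scale).mean())   # mean quantization error
```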

We have tons of mid-sized options now. I suppose a mid-sized MLA model with a long context would be interesting, though, as everything else seems to use regular GQA.
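On the MLA point, the appeal for long context is the KV cache: GQA caches K and V for every kv-head per layer, while MLA caches one shared low-rank latent per token. A rough comparison with illustrative numbers (the MLA figures loosely follow DeepSeek-V2's published 512-dim latent plus 64-dim decoupled RoPE key; the GQA config is made up):

```python
# Rough KV-cache-per-token comparison: GQA vs. MLA (fp16 cache).

def gqa_cache(n_layers, n_kv_heads, head_dim, bytes_per=2):
    # K and V cached separately for every kv head in every layer.
    return n_layers * n_kv_heads * head_dim * 2 * bytes_per

def mla_cache(n_layers, latent_dim, rope_dim, bytes_per=2):
    # One compressed latent (+ small RoPE key) shared by all heads.
    return n_layers * (latent_dim + rope_dim) * bytes_per

layers = 60
print("GQA:", gqa_cache(layers, 8, 128) / 1024, "KiB/token")   # 240.0
print("MLA:", mla_cache(layers, 512, 64) / 1024, "KiB/token")  #  67.5
# At 128k context, that per-token gap works out to ~20 GiB of VRAM.
```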

Yeah, I agree. Let's hope something comes out with a good trim, like 2-bit or native FP4, as in the Apple foundation models and OpenAI's gpt-oss.
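For anyone curious what "native FP4" means concretely: gpt-oss ships MXFP4 weights, i.e. FP4 (E2M1) values in small blocks sharing one power-of-two scale. A toy NumPy sketch of that block format (illustrative, not the exact spec implementation):

```python
import numpy as np

# E2M1 representable magnitudes; max is 6 = 1.5 * 2**2 (emax = 2).
FP4_MAGS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def mxfp4_quant(block: np.ndarray) -> np.ndarray:
    """Quantize one block (the MX spec uses 32 elements) and dequantize."""
    # Shared power-of-two scale chosen from the block's max magnitude.
    scale = 2.0 ** (np.floor(np.log2(np.abs(block).max() + 1e-12)) - 2)
    idx = np.argmin(np.abs(np.abs(block[:, None]) / scale - FP4_MAGS), axis=1)
    return np.sign(block) * FP4_MAGS[idx] * scale

x = np.random.randn(32).astype(np.float32)
print(np.abs(x - mxfp4_quant(x)).mean())  # quantization error
```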

We already got a native 2-bit QAT 300B from Baidu! Yeah, more native QAT would be awesome.
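And for what "native QAT" buys: the model is trained with quantization in the forward pass, using a straight-through estimator so gradients survive the rounding. A toy 2-bit fake-quant sketch in PyTorch (illustrative only, not Baidu's actual recipe):

```python
import torch

class FakeQuant2Bit(torch.autograd.Function):
    """2-bit fake quantization with a straight-through estimator."""

    @staticmethod
    def forward(ctx, w, scale):
        # Round to 4 levels {-2, -1, 0, 1} * scale -- i.e. 2 bits per weight.
        return torch.clamp(torch.round(w / scale), -2, 1) * scale

    @staticmethod
    def backward(ctx, grad_out):
        # STE: pretend the rounding was the identity, pass gradients through.
        return grad_out, None

w = torch.randn(8, requires_grad=True)
scale = w.detach().abs().mean()    # fixed scale for the sketch, not learned
out = FakeQuant2Bit.apply(w, scale)
out.sum().backward()
print(w.grad)  # gradients flow despite the non-differentiable rounding
```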

No one's converted the 300B weights out of PaddlePaddle though, as far as I know.

> PaddlePaddle

To be honest, no one has a clue about this either. Maybe the Chinese-speaking folks have their own llama.cpp alternatives and have been running them on their CPUs? Who knows? 😎
