Adding `safetensors` variant of this model
#27 opened about 1 year ago
by
SFconvertbot

Do we have a plan for posting the evaluation results to `open_llm_leaderboard`?
3
#26 opened over 1 year ago
by
mpsk

Context length schedule and performance
3
#25 opened over 1 year ago
by
baffo32
Adding `safetensors` variant of this model
1
#24 opened over 1 year ago
by
SFconvertbot

HF version
#23 opened over 1 year ago
by
edmond
Pretraining hyperparameters?
#21 opened over 1 year ago
by
PY007

How to run on Colab's CPU?
1
#20 opened over 1 year ago
by
deepakkaura26

QLoRA fine-tuning
1
#19 opened over 1 year ago
by
TinyPixel
Why is `get_mup_param_groups` needed instead of the default one in Hugging Face?
#18 opened over 1 year ago
by
sanqiang
No CUDA information / nvidia-smi / nvtop
1
#17 opened over 1 year ago
by
nudelbrot
How to reproduce quantized memory usage?
6
#16 opened over 1 year ago
by
tarasglek
What is the inference time? On my Apple M1 Max completions take > 6 min
9
#15 opened over 1 year ago
by
vedtam
Fine-tuning on coding tasks
1
#14 opened over 1 year ago
by
sgaseretto
Your 3b model is very exciting and proves that data improvement works!
#13 opened over 1 year ago
by
win10

Any plans on releasing GPTQ or GGML versions of this?
4
#12 opened over 1 year ago
by
FriendlyVisage
Why can't we make this fully HF-ready?
8
#11 opened over 1 year ago
by
CUIGuy
LoraConfig's `target_modules` with peft?
8
#10 opened over 1 year ago
by
Handgun1773
Include fastchat-t5, which is also a 3B-parameter model, in the benchmark
#9 opened over 1 year ago
by
vasilee
Recommendations for additional pretraining?
4
#8 opened over 1 year ago
by
ZQ-Dev
