Haili TIAN
haili-tian
AI & ML interests
None yet
Recent Activity
new activity 14 days ago
deepseek-ai/DeepSeek-R1: Lite version for DeepSeek-R1?
new activity 19 days ago
deepseek-ai/DeepSeek-R1-Distill-Llama-70B: weight files naming is not regular rule
new activity 19 days ago
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B: weight files naming is not regular rule
Organizations
None yet
haili-tian's activity
Lite version for DeepSeek-R1?
1
#137 opened 14 days ago by haili-tian
weight files naming is not regular rule
#13 opened 19 days ago by haili-tian
weight files naming is not regular rule
#29 opened 19 days ago by haili-tian
bos_token_id is defined incorrectly
1
#28 opened 19 days ago by haili-tian
System Prompt
18
#2 opened about 1 month ago by Wanfq
What temp are these expected to be used at?
2
#6 opened about 1 month ago by rombodawg
running on local machine
7
#19 opened 27 days ago by saidavanam
System Prompt
13
#2 opened about 1 month ago by Wanfq
Can not use HF transformers for inference?
#11 opened 4 months ago by haili-tian
max_window_layers is 70?
2
#1 opened 5 months ago by haili-tian
sliding_window is null?
1
#84 opened 5 months ago by haili-tian
Qwen1.5 series, I choose Qwen1.5-32B
#3 opened 9 months ago by haili-tian
Qwen1.5-32B?
#4 opened 9 months ago by haili-tian