- LFM2.5-VL-1.6B is my daily driver for security camera analysis — 51 tokens/sec with full Metal GPU acceleration, and it just works (❤️ 7) · #7 opened 7 days ago by SharpAI
- Tokenizer mismatch when serving with vLLM (2) · #6 opened 25 days ago by one-man-won
- CPU-only inference broken with latest llama.cpp? (🤝 1) · #4 opened about 1 month ago by dinerburger
- Ollama Support (6) · #2 opened about 2 months ago by yqchen-sci