In a Training Loop 🔄

31 235

inspire PRO

inspirebek

AI & ML interests

CUDA out of memory.

Recent Activity

liked a model 2 days ago

HiDolen/Mini-BS-RoFormer-18M

liked a Space 9 days ago

google/embeddinggemma-tuning-lab

liked a Space 15 days ago

wjbmattingly/NuMarkdown-8B-Thinking-Demo

upvoted a collection about 1 month ago

TranslateGemma

3 items • Updated Jan 15 • 212

upvoted a collection 4 months ago

Nanonets-OCR2

2 items • Updated Oct 13, 2025 • 25

upvoted 3 changelogs 6 months ago

Changelog

Introducing a better Hugging Face CLI

Jul 25, 2025

• 96

Changelog

Trending Papers

Jul 28, 2025

• 106

Changelog

Introducing HF Jobs: Run scalable compute jobs on Hugging Face

Jul 30, 2025

• 202

upvoted a paper 7 months ago

Efficient Agents: Building Effective Agents While Reducing Cost

Paper • 2508.02694 • Published Jul 24, 2025 • 86

upvoted a paper 10 months ago

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

Paper • 2410.05983 • Published Oct 8, 2024 • 2

upvoted a collection 10 months ago

Search-R1-v0.2

Exploration with a more stable RL pipeline with outcome-only reward and scaled-up LLMs. https://arxiv.org/abs/2503.09516 • 26 items • Updated Aug 12, 2025 • 5

upvoted a collection 11 months ago

reranking series v2

V2 crispy rerank series • 3 items • Updated Jun 25, 2025 • 25

upvoted 3 collections 12 months ago

DeepSeek R1 (All Versions)

DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 4 days ago • 262

Cohere Labs Aya 23

Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated Jul 31, 2025 • 56

Cohere Labs Aya Expanse

Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated Jul 31, 2025 • 42

upvoted 2 collections about 1 year ago

Deepseek Papers

Deepseek papers collection • 29 items • Updated 4 days ago • 323

UzLLM

A collection of Uzbek-adapted LLMs. • 4 items • Updated Dec 4, 2024 • 6

upvoted a paper about 1 year ago

Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding

Paper • 2501.07888 • Published Jan 14, 2025 • 15

upvoted 2 collections about 1 year ago

InternLM3

6 items • Updated Dec 30, 2025 • 30

DeepSeek-VL2

5 items • Updated Nov 27, 2025 • 80

upvoted 3 collections over 1 year ago

LipSync and Face Operations

23 items • Updated Jan 2 • 63

Molmo

Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 309

Llama 3.2

Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 4 days ago • 68