Rajdeep Borgohain

rbgo

RajdeepBorgohain

AI & ML interests

Solving language barriers.

Recent Activity

upvoted a paper 12 days ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

reacted to Bils's post with 🔥 24 days ago

🚀 We're excited to share major improvements to our Janus-Pro-7B Text-to-Image Generation Space!

🎨 What's New:
1. Critical Bug Fixes
2. Enhanced Features
3. UI Improvements
4. Performance Boost

Try It Now: https://huggingface.co/spaces/Bils/DeepseekJanusPro-Image

upvoted an article 25 days ago

Mastering Long Contexts in LLMs with KVPress

rbgo's activity

upvoted a paper 12 days ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 13 days ago • 134

upvoted 2 articles 25 days ago

Mastering Long Contexts in LLMs with KVPress

Article • Jan 23 • 63

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 27 days ago • 770

upvoted 2 collections 28 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 28 days ago • 360

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 28 days ago • 100

upvoted 2 collections about 1 month ago

DeepSeek-V2

8 items • Updated Jan 3 • 27

DeepSeek-LLM

DeepSeek LLM series • 5 items • Updated Aug 16, 2024 • 13

upvoted a paper about 1 month ago

KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

Paper • 2412.06071 • Published Dec 8, 2024 • 9

upvoted an article about 1 month ago

Timm ❤️ Transformers: Use any timm model with transformers

Article • Jan 16 • 39

upvoted a paper about 2 months ago

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 107

upvoted a collection 2 months ago

PaliGemma 2 Release

Vision-language models available in 3B, 10B, and 28B variants. • 23 items • Updated Dec 13, 2024 • 141

upvoted a paper 2 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

upvoted 4 collections 2 months ago

Qwen2.5

Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). • 45 items • Updated Nov 28, 2024 • 525

Qwen2

Qwen2 language models: pretrained and instruction-tuned models in 5 sizes (0.5B, 1.5B, 7B, 57B-A14B, and 72B). • 39 items • Updated Nov 28, 2024 • 357

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 570

Qwen

Qwen • 16 items • Updated Nov 28, 2024 • 16

upvoted 2 papers 5 months ago

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 126

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 53

upvoted 2 papers 6 months ago

Scalable AI Safety via Doubly-Efficient Debate

Paper • 2311.14125 • Published Nov 23, 2023 • 2

Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

Paper • 2406.09264 • Published Jun 13, 2024 • 1