Bhimraj Yadav PRO

bhimrazy

https://bhimraj.com.np

AI & ML interests

Computer Vision, Healthcare, Generative AI and NLP

bhimrazy's activity

upvoted 2 papers 2 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 4 days ago • 136

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 3 days ago • 99

upvoted 3 papers 3 days ago

SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering?

Paper • 2502.13233 • Published 5 days ago • 11

Craw4LLM: Efficient Web Crawling for LLM Pretraining

Paper • 2502.13347 • Published 5 days ago • 24

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 5 days ago • 42

upvoted 15 papers 4 days ago

ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17 • 19

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22 • 83

Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published 29 days ago • 61

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 29 days ago • 61

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published 25 days ago • 55

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 19 days ago • 190

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 13 days ago • 134

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 12 days ago • 27

Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published 20 days ago • 62

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 11 days ago • 139

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published 9 days ago • 29