132 142

Mahmud ElHuseyni 🇵🇸

MElHuseyni

AI & ML interests

Computer Vision, NLP, Machine Learning

Recent Activity

updated a collection about 18 hours ago

Image Segmentation Models 🍪

upvoted a paper about 18 hours ago

Ovis2.5 Technical Report

Paper • 2508.11737 • Published 6 days ago • 93

upvoted a collection 1 day ago

interesting architecture

19 items • Updated 2 days ago • 2

upvoted an article 3 days ago

Exploring the Daily Papers Page on Hugging Face

Article • Sep 23, 2024 • 63

upvoted a paper 3 days ago

A Survey on Diffusion Language Models

Paper • 2508.10875 • Published 6 days ago • 31

upvoted a collection 3 days ago

Canary

A collection of multilingual and multitask speech to text models from NVIDIA NeMo 🐤 • 5 items • Updated 6 days ago • 25

upvoted 2 collections 6 days ago

ParallelSearch

Checkpoints for our paper "ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning" • 3 items • Updated 13 days ago • 5

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated 6 days ago • 195

upvoted a collection 7 days ago

Qwen3

84 items • Updated 15 days ago • 1.11k

upvoted a paper 7 days ago

IAUNet: Instance-Aware U-Net

Paper • 2508.01928 • Published 17 days ago • 8

upvoted an article 8 days ago

Vision Language Model Alignment in TRL ⚡️

Article • 5 authors • 14 days ago • 69

upvoted a paper 8 days ago

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published 13 days ago • 114

upvoted 2 collections 8 days ago

Diffusion Model

49 items • Updated Aug 19, 2024 • 9

👁️ LFM2-VL

LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 6 items • Updated 2 days ago • 31

upvoted a paper 8 days ago

Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents

Paper • 2502.04223 • Published Feb 6 • 11

upvoted an article 8 days ago

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Article • 5 authors • 10 days ago • 60

upvoted a paper 9 days ago

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published 11 days ago • 82

upvoted a collection 9 days ago

Turkish Hallucination Detection Models

6 items • Updated 29 days ago • 6

upvoted an article 10 days ago

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

Article • 5 authors • 13 days ago • 50

upvoted a paper 11 days ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published 13 days ago • 151

upvoted a collection 12 days ago

Vision Language Leaderboards

This collection has all the vision language leaderboards. • 7 items • Updated Aug 24, 2024 • 21