Aritra Roy Gosthipaty PRO

ariG23498

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

upvoted a collection about 11 hours ago

updated a dataset about 20 hours ago

model-metadata/custom_code_execution_files

updated a dataset about 20 hours ago

model-metadata/models_executed_urls

upvoted a collection about 11 hours ago

SuryaBench

Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction • 8 items • Updated 2 days ago • 3

upvoted an article 1 day ago

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

Article • 5 authors • 13 days ago • 50

upvoted an article 2 days ago

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

Article • 2 authors • 3 days ago • 32

upvoted a collection 2 days ago

NVIDIA Nemotron

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 3 items • Updated 2 days ago • 39

upvoted 2 papers 3 days ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 68

DINOv3

Paper • 2508.10104 • Published 7 days ago • 126

upvoted a collection 6 days ago

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated 5 days ago • 195

upvoted a paper 7 days ago

Technical Report: Full-Stack Fine-Tuning for the Q Programming Language

Paper • 2508.06813 • Published 12 days ago • 5

upvoted a collection 8 days ago

qqWen-Series

Based off the Qwen-2.5 Series - model finetuned for the Q programming language. • 6 items • Updated 14 days ago • 8

upvoted an article 8 days ago

🕳️ Attention Sinks in LLMs for endless fluency

Article • Oct 9, 2023 • 15

upvoted a paper 8 days ago

Aryabhata: An exam-focused language model for JEE Math

Paper • 2508.08665 • Published 9 days ago • 16

upvoted an article 8 days ago

Optimization story: Bloom inference

Article • Oct 12, 2022 • 6

upvoted a collection 8 days ago

👁️ LFM2-VL

LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 6 items • Updated 1 day ago • 31

upvoted an article 8 days ago

Welcome GPT OSS, the new open-source model family from OpenAI!

Article • 12 authors • 16 days ago • 467

upvoted 2 articles 9 days ago

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

Article • 7 authors • Jun 12 • 125

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Article • 5 authors • 9 days ago • 60

upvoted 2 collections 10 days ago

LLMDet

See: https://github.com/huggingface/transformers/pull/37925 • 3 items • Updated Jun 26 • 3

MM Grounding DINO

See: https://github.com/huggingface/transformers/pull/37925 • 8 items • Updated Jun 26 • 4

upvoted an article 13 days ago

Vision Language Model Alignment in TRL ⚡️

Article • 5 authors • 14 days ago • 69

upvoted a collection 15 days ago

gpt-oss

Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated 14 days ago • 311