Samuel Arcadinho

SSamDav

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

nanotron/ultrascale-playbook

upvoted a collection 5 days ago

Dria-Agent-a

upvoted a collection 5 days ago

Tiny-Agent-a

SSamDav's activity

liked a Space 3 days ago

1.38k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted 2 collections 5 days ago

Dria-Agent-a

Collection

powerful agentic models built for pythonic function calling • 4 items • Updated 9 days ago • 4

Tiny-Agent-a

Collection

fast and powerful agentic models designed to run on edge devices. • 6 items • Updated 11 days ago • 7

commented 2 papers 13 days ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published 16 days ago • 114

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published 16 days ago • 114

upvoted a paper 13 days ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published 16 days ago • 114

upvoted a paper 20 days ago

Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published 23 days ago • 20

commented a paper 20 days ago

Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published 23 days ago • 20

upvoted 2 papers 20 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 23 days ago • 105

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Paper • 2411.04983 • Published Nov 7, 2024 • 11

upvoted a paper 25 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 26 days ago • 106

upvoted 2 papers 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 134

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

liked a dataset 3 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated Jan 8 • 12.5B • 69.3k • 433

upvoted 4 papers 3 months ago

NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Paper • 2412.02030 • Published Dec 2, 2024 • 19

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images

Paper • 2412.03517 • Published Dec 4, 2024 • 19

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 128

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Paper • 2411.07232 • Published Nov 11, 2024 • 65

liked a model 4 months ago

nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

Text Generation • Updated Oct 25, 2024 • 118k • 2.02k

upvoted a paper 5 months ago

Agent S: An Open Agentic Framework that Uses Computers Like a Human

Paper • 2410.08164 • Published Oct 10, 2024 • 24