Article • Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Dec 18, 2025
Paper • Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition • arXiv:2512.15603 • Published Dec 17, 2025
Article • The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator • Dec 17, 2025
Article • Provence: Efficient and Robust Context Pruning for Retrieval-Augmented Generation • Jan 28, 2025
Paper • PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • arXiv:2510.14528 • Published Oct 16, 2025
Paper • Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention • arXiv:2510.04212 • Published Oct 5, 2025
Paper • Reactive Transformer (RxT) — Stateful Real-Time Processing for Event-Driven Reactive Language Models • arXiv:2510.03561 • Published Oct 3, 2025
Paper • SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights • arXiv:2509.22944 • Published Sep 26, 2025
Article • Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training • May 17, 2025
Article • Reachy Mini — The Open-Source Robot for Today's and Tomorrow's AI Builders • Jul 9, 2025
Article • Run the Strongest Open-Source LLM: Llama 3 70B with Just a Single 4GB GPU! • Apr 21, 2024
Paper • ORPO: Monolithic Preference Optimization without Reference Model • arXiv:2403.07691 • Published Mar 12, 2024
Collection • Qwen1.5 — the improved version of Qwen, the large language model series developed by Alibaba Cloud • 55 items • Updated 27 days ago