💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 15 items • Updated 2 days ago • 91
Hybrid Linear Attention Research Collection All 1.3B & 340M hybrid linear-attention experiments. • 60 items • Updated Jul 7 • 9
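The experiments in this collection replace softmax attention with linear-attention variants. As a reference point, here is a minimal sketch of the causal linear-attention recurrence such models build on, assuming an ELU+1 feature map; it is an illustration only, not code from the collection.

```python
# Minimal causal linear attention; illustrative sketch, not collection code.
import numpy as np

def phi(x):
    # Positive feature map (ELU + 1), a common choice for linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Q, K: (T, d_k), V: (T, d_v). Runs in O(T * d_k * d_v) by carrying
    a fixed-size running state S = sum_t phi(k_t) v_t^T instead of a
    T x T attention matrix."""
    T, d_k = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))   # running sum of phi(k) v^T
    z = np.zeros(d_k)          # running sum of phi(k) for normalization
    out = np.zeros((T, d_v))
    for t in range(T):
        q, k, v = phi(Q[t]), phi(K[t]), V[t]
        S += np.outer(k, v)
        z += k
        out[t] = (q @ S) / (q @ z + 1e-6)
    return out
```

Because the state `S` has fixed size, per-token decoding cost is constant in sequence length, which is what makes these variants attractive components for hybrid architectures.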
Avey 1 Research Preview Collection 1.5B preview models trained on 100B tokens of FineWeb, and an instruct-tuned version on smoltalk. • 3 items • Updated Jun 16 • 6
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of V-JEPA (https://ai.meta.com/blog/v-jepa-yann) • 8 items • Updated Jun 13 • 156
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated 21 days ago • 51
Kimina Prover Preview Collection State-of-the-Art Models for Formal Mathematical Reasoning • 5 items • Updated Apr 28 • 33
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional at agentic, long-context, and reasoning tasks • 7 items • Updated Jul 1 • 74
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • May 7, 2024 • 96
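Since the article walks through a sparse MoE built from scratch, a compact sketch of the core layer may help: a linear router scores each token, only the top-k experts run on it, and their outputs are mixed by the renormalized router weights. The dimensions, expert count, and `top_k` below are illustrative assumptions, not the article's exact code.

```python
# Compact sparse-MoE layer in the spirit of a from-scratch walkthrough;
# all hyperparameters here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=128, d_ff=512, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (batch, seq, d_model) -> flatten to tokens for routing.
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                    # (N, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep top-k experts
        weights = F.softmax(weights, dim=-1)            # renormalize over top-k
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                 # which tokens routed to expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot, None] * expert(tokens[token_ids])
        return out.reshape(x.shape)
```

Each token is processed by only `top_k` of the experts, so compute per token stays roughly constant as the expert count (and total parameter count) grows.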
L1 Collection L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning • 7 items • Updated Jul 13 • 7
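The title describes rewarding a model for hitting a target reasoning length. Below is a hedged sketch of a reward of that flavor, a correctness term minus a penalty proportional to deviation from a token budget; the coefficient and the linear penalty form are assumptions, see the collection's paper for the actual objective.

```python
# Hedged sketch of a length-controlled reward: reward correctness and
# penalize deviation from a target token budget. The coefficient alpha
# and the linear penalty are illustrative assumptions, not the paper's
# exact formulation.
def length_controlled_reward(is_correct: bool, n_generated: int,
                             n_target: int, alpha: float = 1e-3) -> float:
    correctness = 1.0 if is_correct else 0.0
    return correctness - alpha * abs(n_target - n_generated)
```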
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Jul 14 • 63
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control By danaaubakirova and 3 others • Feb 4 • 169
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 283
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published Jan 4 • 100
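REINFORCE++'s pitch is critic-free alignment, so a sketch of the general idea may be useful: shape the sequence-level reward with a token-level KL penalty against a reference model, then normalize advantages across the batch instead of learning a value function. This is a simplified illustration of that family of methods, not the paper's exact recipe; the KL coefficient and shapes are assumptions.

```python
# Hedged sketch of a critic-free, batch-normalized advantage for
# REINFORCE-style LLM alignment. Illustrative only; see arXiv:2501.03262
# for the exact formulation.
import torch

def reinforce_advantages(rewards: torch.Tensor,
                         kl_per_token: torch.Tensor,
                         kl_coef: float = 0.01) -> torch.Tensor:
    """rewards: (B,) scalar reward per sampled response.
    kl_per_token: (B, T) KL(policy || reference) per generated token.
    Returns token-level advantages of shape (B, T)."""
    # Fold a token-level KL penalty into the reward signal.
    shaped = rewards[:, None] - kl_coef * kl_per_token   # (B, T)
    # Batch-wide normalization stands in for a learned critic baseline.
    return (shaped - shaped.mean()) / (shaped.std() + 1e-8)
```

The advantages then weight the per-token log-probabilities in a standard policy-gradient loss, avoiding the memory and tuning cost of a separate value model.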