Eugene Oskin

eoskin

AI & ML interests

None yet

Organizations

None yet

eoskin's activity

upvoted a paper about 11 hours ago

Phantom: Subject-consistent video generation via cross-modal alignment

Paper • 2502.11079 • Published 7 days ago • 49

upvoted a paper about 12 hours ago

PAFT: Prompt-Agnostic Fine-Tuning

Paper • 2502.12859 • Published 5 days ago • 12

upvoted a paper 1 day ago

Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking

Paper • 2501.00244 • Published Dec 31, 2024 • 1

upvoted a paper 3 days ago

Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 17

upvoted 3 papers 4 days ago

Retrofitting Word Vectors to Semantic Lexicons

Paper • 1411.4166 • Published Nov 15, 2014 • 1

StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 31

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 19 days ago • 190

upvoted 7 papers 5 days ago

ReLearn: Unlearning via Learning for Large Language Models

Paper • 2502.11190 • Published 7 days ago • 28

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Paper • 2310.06770 • Published Oct 10, 2023 • 5

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Paper • 2502.09696 • Published 10 days ago • 38

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Paper • 2502.08235 • Published 11 days ago • 53

TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

Paper • 2502.06608 • Published 13 days ago • 32

Large Language Diffusion Models

Paper • 2502.09992 • Published 9 days ago • 75

Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published 20 days ago • 62

upvoted an article 5 days ago

Mixture of Experts Explained

Article • Published Dec 11, 2023 • 392

upvoted 2 papers 5 days ago

ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning

Paper • 2501.15316 • Published 29 days ago • 1

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Paper • 2502.07780 • Published 12 days ago • 17

upvoted 3 papers 6 days ago

Shortened LLaMA: A Simple Depth Pruning for Large Language Models

Paper • 2402.02834 • Published Feb 5, 2024 • 16

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 63

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

Paper • 2203.07259 • Published Mar 14, 2022 • 4