MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published 4 days ago • 26
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions Paper • 2502.13791 • Published 4 days ago • 5
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 26 days ago • 106
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published Jan 13 • 91
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring Paper • 2501.02045 • Published Jan 3 • 21
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation Paper • 2501.01895 • Published Jan 3 • 51
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 80
PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion Paper • 2412.17780 • Published Dec 23, 2024 • 4
Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published Dec 19, 2024 • 9
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 10 days ago • 81
Article Comparing Open-source and Proprietary LLMs in Medical AI By mpimentel • Oct 3, 2024 • 16
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published Dec 4, 2024 • 13
Insight-V Collection Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models • 5 items • Updated Nov 22, 2024 • 9
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 109