Collections including paper arxiv:2509.04419

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 624
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 298
- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 294
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
  Paper • 2509.03867 • Published • 101

- Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
  Paper • 2508.09789 • Published • 5
- MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
  Paper • 2508.13186 • Published • 17
- ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
  Paper • 2508.04038 • Published • 1
- Prompt Orchestration Markup Language
  Paper • 2508.13948 • Published • 45

- Nuclear Norm Regularization for Deep Learning
  Paper • 2405.14544 • Published • 1
- Token embeddings violate the manifold hypothesis
  Paper • 2504.01002 • Published • 1
- Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
  Paper • 2403.10476 • Published • 1
- ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning
  Paper • 2504.00254 • Published • 1

- Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
  Paper • 2409.11355 • Published • 32
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 141
- Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs
  Paper • 2409.14988 • Published • 24
- LLM Pretraining with Continuous Concepts
  Paper • 2502.08524 • Published • 29

- Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR
  Paper • 2509.02522 • Published • 22
- Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
  Paper • 2508.20751 • Published • 85
- The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
  Paper • 2509.02547 • Published • 139
- Towards a Unified View of Large Language Model Post-Training
  Paper • 2509.04419 • Published • 43

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 281 • 95
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 35
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 98
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 89

- Rethinking Data Selection at Scale: Random Selection is Almost All You Need
  Paper • 2410.09335 • Published • 17
- From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning
  Paper • 2410.06456 • Published • 38
- Emergent properties with repeated examples
  Paper • 2410.07041 • Published • 8
- Personalized Visual Instruction Tuning
  Paper • 2410.07113 • Published • 71

- When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
  Paper • 2402.17193 • Published • 27
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
  Paper • 2410.23743 • Published • 64
- Direct Preference Optimization Using Sparse Feature-Level Constraints
  Paper • 2411.07618 • Published • 17
- Transformer^2: Self-adaptive LLMs
  Paper • 2501.06252 • Published • 55