Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2406.09415

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 26
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 42
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 22

Running on CPU Upgrade

1.99k

1.99k

Anychat

🏢

Select and display code snippets for various AI providers
Running

266

266

Qwen2.5 Coder Artifacts

🐢

Generate application code with Qwen2.5-Coder-32B
Running

884

884

QwQ-32B-Preview

🔍

QwQ-32B-Preview
Running on CPU Upgrade

12.6k

12.6k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Paper • 2406.09406 • Published Jun 13, 2024 • 15
VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Paper • 2406.10227 • Published Jun 14, 2024 • 9
What If We Recaption Billions of Web Images with LLaMA-3?

Paper • 2406.08478 • Published Jun 12, 2024 • 39

Depth Anything V2

Paper • 2406.09414 • Published Jun 13, 2024 • 97
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion

Paper • 2406.04338 • Published Jun 6, 2024 • 37
SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 113

Cellular Automata DL

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51

Depth Estimation

Depth Anything V2

Paper • 2406.09414 • Published Jun 13, 2024 • 97
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51

Relevant-Papers-Midterm

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

Paper • 2402.14848 • Published Feb 19, 2024 • 18
The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6, 2024 • 58
CRAG -- Comprehensive RAG Benchmark

Paper • 2406.04744 • Published Jun 7, 2024 • 45
Transformers meet Neural Algorithmic Reasoners

Paper • 2406.09308 • Published Jun 13, 2024 • 44

Cognitively Inspired Energy-Based World Models

Paper • 2406.08862 • Published Jun 13, 2024 • 10
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51
OpenVLA: An Open-Source Vision-Language-Action Model

Paper • 2406.09246 • Published Jun 13, 2024 • 37

MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Paper • 2405.20340 • Published May 30, 2024 • 20
Spectrally Pruned Gaussian Fields with Neural Compensation

Paper • 2405.00676 • Published May 1, 2024 • 10
Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Paper • 2404.18212 • Published Apr 28, 2024 • 29
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29, 2024 • 120

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs