- DocLLM: A layout-aware generative language model for multimodal document understanding (Paper • 2401.00908 • Published • 181)
- COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training (Paper • 2401.00849 • Published • 17)
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents (Paper • 2311.05437 • Published • 50)
- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing (Paper • 2311.00571 • Published • 40)
Collections
Collections including paper arxiv:2405.17247
- LoRA+: Efficient Low Rank Adaptation of Large Models (Paper • 2402.12354 • Published • 6)
- The FinBen: An Holistic Financial Benchmark for Large Language Models (Paper • 2402.12659 • Published • 21)
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization (Paper • 2402.13249 • Published • 13)
- TrustLLM: Trustworthiness in Large Language Models (Paper • 2401.05561 • Published • 69)
- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters (Paper • 2402.04252 • Published • 26)
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models (Paper • 2402.03749 • Published • 13)
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding (Paper • 2402.04615 • Published • 42)
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss (Paper • 2402.05008 • Published • 22)
- Can Large Language Models Understand Context? (Paper • 2402.00858 • Published • 23)
- OLMo: Accelerating the Science of Language Models (Paper • 2402.00838 • Published • 83)
- Self-Rewarding Language Models (Paper • 2401.10020 • Published • 146)
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity (Paper • 2401.17072 • Published • 25)
- Flowing from Words to Pixels: A Framework for Cross-Modality Evolution (Paper • 2412.15213 • Published • 26)
- No More Adam: Learning Rate Scaling at Initialization is All You Need (Paper • 2412.11768 • Published • 41)
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference (Paper • 2412.13663 • Published • 134)
- Autoregressive Video Generation without Vector Quantization (Paper • 2412.14169 • Published • 14)
- Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction (Paper • 2409.18124 • Published • 32)
- LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness (Paper • 2409.18125 • Published • 34)
- Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices (Paper • 2410.11795 • Published • 18)
- An Introduction to Vision-Language Modeling (Paper • 2405.17247 • Published • 87)
- LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs (Paper • 2408.13467 • Published • 25)
- An Introduction to Vision-Language Modeling (Paper • 2405.17247 • Published • 87)
- Transformers Can Do Arithmetic with the Right Embeddings (Paper • 2405.17399 • Published • 52)
- An Introduction to Vision-Language Modeling (Paper • 2405.17247 • Published • 87)
- Visual Instruction Tuning (Paper • 2304.08485 • Published • 13)
- Improved Baselines with Visual Instruction Tuning (Paper • 2310.03744 • Published • 37)
- PALO: A Polyglot Large Multimodal Model for 5B People (Paper • 2402.14818 • Published • 23)