Jaeyoon Jung PRO

lastdefiance20

AI & ML interests

multimodal

lastdefiance20's activity

upvoted a paper 3 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 3 days ago • 99

liked a model 4 days ago

perplexity-ai/r1-1776

Updated 4 days ago • 7.75k • 1.51k

upvoted a paper 4 days ago

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 5 days ago • 42

liked a dataset 8 days ago

open-thoughts/OpenThoughts-114k

Viewer • Updated 3 days ago • 228k • 103k • 588

upvoted a paper 11 days ago

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 12 days ago • 27

upvoted a paper 12 days ago

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14 • 32

liked a model 14 days ago

ibm-granite/granite-vision-3.1-2b-preview

Image-Text-to-Text • Updated 2 days ago • 9.65k • 79

liked a dataset about 1 month ago

DAMO-NLP-SG/multimodal_textbook

Updated Jan 11 • 6.33k • 132

upvoted a paper about 1 month ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 78

liked a dataset about 1 month ago

HumanLLMs/Human-Like-DPO-Dataset

Viewer • Updated Jan 12 • 10.9k • 2.7k • 198

upvoted a paper about 1 month ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published Jan 7 • 49

upvoted a paper about 2 months ago

LearnLM: Improving Gemini for Learning

Paper • 2412.16429 • Published Dec 21, 2024 • 22

liked a model about 2 months ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 22 days ago • 1.09M • 3.36k

upvoted a paper about 2 months ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 55

liked a model 2 months ago

ibm-granite/granite-3.1-8b-instruct

Text Generation • Updated 24 days ago • 94.1k • 151

upvoted a paper 2 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

liked 2 models 2 months ago

answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 10M • 765

vidore/colpali-v1.3-hf

Visual Document Retrieval • Updated 18 days ago • 1.48k • 23

liked a dataset 2 months ago

BAAI/Infinity-MM

Updated Dec 13, 2024 • 13.1k • 90

liked a model 2 months ago

google/Gemma-Embeddings-v1.0

Updated Dec 16, 2024 • 98 • 120