Image - an aslessor Collection

Image

updated about 6 hours ago

ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation

Paper • 2506.18095 • Published Jun 22 • 65

VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

Paper • 2507.04590 • Published Jul 7 • 16

Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published 7 days ago • 11