aslessor's Collections

Image

updated about 6 hours ago

  • ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation

    Paper • 2506.18095 • Published Jun 22 • 65

  • VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

    Paper • 2507.04590 • Published Jul 7 • 16

  • Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

    Paper • 2509.00428 • Published 7 days ago • 11