Ming-Omni: A Unified Multimodal Model for Perception and Generation (arXiv:2506.09344, published Jun 11, 2025)
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction (arXiv:2505.02471, published May 5, 2025)
Mimir: Improving Video Diffusion Models for Precise Text Understanding (arXiv:2412.03085, published Dec 4, 2024)
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos (arXiv:2312.15770, published Dec 25, 2023)
Animate-X: Universal Character Image Animation with Enhanced Motion Representation (arXiv:2410.10306, published Oct 14, 2024)
ViM: Vision Middleware for Unified Downstream Transferring (arXiv:2303.06911, published Mar 13, 2023)
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos (arXiv:2303.08345, published Mar 15, 2023)
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following (arXiv:2311.17002, published Nov 28, 2023)
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation (arXiv:2311.15773, published Nov 27, 2023)
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation (arXiv:2311.15841, published Nov 27, 2023)
UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training (arXiv:2302.06891, published Feb 14, 2023)
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval (arXiv:2211.12764, published Nov 23, 2022)
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning (arXiv:2303.15230, published Mar 27, 2023)