Yatharth Sharma's picture

Yatharth Sharma

YaTharThShaRma999

·

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago

YaTharThShaRma999/sound_effects

published a model 1 day ago

YaTharThShaRma999/sound_effects

new activity 3 days ago

stabilityai/stable-diffusion:2.1 elsewhere?

View all activity

Organizations

None yet

YaTharThShaRma999's activity

upvoted a paper 3 days ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published 5 days ago • 34

upvoted a paper 6 days ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published 9 days ago • 49

upvoted a paper 11 days ago

Magic 1-For-1: Generating One Minute Video Clips within One Minute

Paper • 2502.07701 • Published 12 days ago • 32

upvoted a paper 24 days ago

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

Paper • 2501.15654 • Published 28 days ago • 12

upvoted a paper 27 days ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published about 1 month ago • 24

upvoted 3 papers about 1 month ago

RepVideo: Rethinking Cross-Layer Representation for Video Generation

Paper • 2501.08994 • Published Jan 15 • 15

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published Jan 10 • 45

Multi-task retriever fine-tuning for domain-specific and efficient RAG

Paper • 2501.04652 • Published Jan 8 • 10

upvoted 7 papers about 2 months ago

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 69

LTX-Video: Realtime Video Latent Diffusion

Paper • 2501.00103 • Published Dec 30, 2024 • 42

HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Paper • 2412.20735 • Published Dec 30, 2024 • 11

Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 26

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published Dec 30, 2024 • 39

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Paper • 2412.20631 • Published Dec 30, 2024 • 15

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

Paper • 2412.21037 • Published Dec 30, 2024 • 24

upvoted 5 papers 2 months ago

IDOL: Instant Photorealistic 3D Human Creation from a Single Image

Paper • 2412.14963 • Published Dec 19, 2024 • 6

Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Paper • 2412.15213 • Published Dec 19, 2024 • 26

Autoregressive Video Generation without Vector Quantization

Paper • 2412.14169 • Published Dec 18, 2024 • 14

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 13

VidTok: A Versatile and Open-Source Video Tokenizer

Paper • 2412.13061 • Published Dec 17, 2024 • 8