wattai

AI & ML interests

I'm interested in generating BMS charts from text and music prompts.

Organizations

None yet

upvoted 2 papers 3 days ago

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published 5 days ago • 25

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published 4 days ago • 144

upvoted a paper 4 days ago

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

Paper • 2508.21148 • Published 9 days ago • 129

liked 2 models 15 days ago

pyannote/speaker-diarization-3.1

Automatic Speech Recognition • Updated May 10, 2024 • 17.5M • 1.1k

nvidia/diar_streaming_sortformer_4spk-v2

Audio Classification • Updated 23 days ago • 9.73k • 39

upvoted an article about 2 months ago

Article

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

and 1 other • Jul 9 • 666

upvoted a paper 2 months ago

Depth Anything at Any Condition

Paper • 2507.01634 • Published Jul 2 • 51

upvoted an article 2 months ago

Article

Gemma 3n fully available in the open-source ecosystem!

and 7 others • Jun 26 • 116

upvoted an article 3 months ago

Article

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

and 4 others • Jun 19 • 86

upvoted a paper 3 months ago

Discrete Diffusion in Large Language and Multimodal Models: A Survey

Paper • 2506.13759 • Published Jun 16 • 44

liked 5 Spaces 3 months ago

OpenOCR Demo

😻

OCR System. Homepage: https://github.com/Topdu/OpenOCR

108

PaddleOCR

⚡

Extract text from images in multiple languages

Paddle Ocr Demo

🦀

Recognize text in images

205

EasyOCR

🔥

Extract text from images in multiple languages

Sa2VA Simple Demo

🐨

Dense Grounded Understanding of Images and Videos

liked a model 3 months ago

ByteDance/Sa2VA-4B

Image-Text-to-Text • 4B • Updated Mar 19 • 601 • 78

upvoted a paper 3 months ago

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 47

liked a model 3 months ago

shi-labs/oneformer_ade20k_swin_large

Image Segmentation • Updated Jan 19, 2023 • 80.2k • 28

upvoted a paper 3 months ago

Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design

Paper • 2506.04734 • Published Jun 5 • 19

upvoted a paper 4 months ago

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21 • 96
