Xi

xi0v

AI & ML interests

RL, Model merging, Model Editing and Vision/Multimodal Model Fine-tuning.

Recent Activity

liked a model about 7 hours ago

trojblue/sdxl-finetune-pen-feel

liked a dataset about 10 hours ago

trojblue/danbooru2025-metadata

liked a model about 10 hours ago

shadowlilac/OpenGemini-Flash-RLVR

upvoted an article 8 days ago

Article

The Optimal Architecture for Small Language Models

9 days ago • 73

upvoted a paper about 1 month ago

Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

Paper • 2511.13254 • Published Nov 17, 2025 • 136

upvoted a collection about 2 months ago

timm DINOv3

Meta AI's DINOv3 weights in timm. ViTs with `qkvb` have a zero QV bias present, otherwise bias is disabled. QKV bias are all 0 in original weights. • 18 items • Updated Sep 19, 2025 • 26

upvoted a paper about 2 months ago

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published Nov 13, 2025 • 49

upvoted an article about 2 months ago

Article

Projected Abliteration

Oct 25, 2025 • 35

upvoted a paper 2 months ago

π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Paper • 2510.25889 • Published Oct 29, 2025 • 64

upvoted a collection 3 months ago

_Originals

37 items • Updated Jan 20, 2025 • 1

upvoted a paper 4 months ago

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10, 2025 • 56

upvoted 2 papers 5 months ago

Puppeteer: Rig and Animate Your 3D Models

Paper • 2508.10898 • Published Aug 14, 2025 • 33

Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

Paper • 2508.09736 • Published Aug 13, 2025 • 57

upvoted a collection 5 months ago

Hybrid Linear Attention Research

All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated Sep 11, 2025 • 12

upvoted a paper 5 months ago

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published Jul 28, 2025 • 31

upvoted an article 6 months ago

Article

Vibe coding for data science: how to label a dataset with Kimi K2

Jul 22, 2025 • 21

upvoted 5 papers 6 months ago

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published Jul 17, 2025 • 23

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11, 2025 • 31

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

Paper • 2507.02608 • Published Jul 3, 2025 • 21

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3, 2025 • 25

FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

Paper • 2506.20911 • Published Jun 26, 2025 • 41

upvoted an article 6 months ago

Article

Gemma 3n fully available in the open-source ecosystem!

Jun 26, 2025 • 120

upvoted a paper 7 months ago

A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA

Paper • 2312.03732 • Published Nov 28, 2023 • 12