Jay

jaigouk

https://jaigouk.com

AI & ML interests

None yet

Recent Activity

updated a model about 18 hours ago

jaigouk/Qwen2.5-14B-GRPO

published a model about 18 hours ago

jaigouk/Qwen2.5-14B-GRPO

updated a model 1 day ago

jaigouk/qwhen2_5_3b_grpo

jaigouk's activity

upvoted a paper 3 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 4 days ago • 136

upvoted an article 4 days ago

We now support VLMs in smolagents!

Article • about 1 month ago • 84

upvoted a paper 9 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 11 days ago • 139

upvoted 2 papers about 1 month ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 273

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 55

upvoted a paper 2 months ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 92

upvoted a paper 6 months ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 134

upvoted an article 8 months ago

An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct

Article • Jun 11, 2024 • 56

upvoted a paper 10 months ago

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 84

upvoted 4 papers 11 months ago

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 90

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4, 2024 • 93

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 52

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 76

upvoted a paper 12 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 609

upvoted 5 papers about 1 year ago