Nathan Simons

JoeySalmons

AI & ML interests

I like AI

Recent Activity

liked a model 1 day ago

AIDC-AI/Ovis2-34B

upvoted a collection 1 day ago

liked a dataset 2 days ago

m-a-p/SuperGPQA

Organizations

None yet

JoeySalmons's activity

upvoted a collection 1 day ago

Ovis2

Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 7 days ago • 51

upvoted a paper 2 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 3 days ago • 87

upvoted a collection 4 days ago

PaliGemma 2 Mix

13 items • Updated 4 days ago • 56

upvoted a collection 6 days ago

Step-Audio

Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 3 items • Updated 6 days ago • 26

upvoted a collection 9 days ago

Hamanasu

A brand new series of Models from yours truly, Designed for Intelligence, Creativity and Roleplay. • 9 items • Updated 9 days ago • 4

upvoted a collection 11 days ago

OLMoE (January 2025)

Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app • 10 items • Updated 12 days ago • 9

upvoted an article 13 days ago

Open R1: Update #2

Article • By … and 6 others • 13 days ago • 184

upvoted an article 19 days ago

Open-source DeepResearch – Freeing our search agents

Article • 20 days ago • 1.08k

upvoted a collection 19 days ago

SFTvsRL Models & Data

This collection contains 4 initial checkpoints for https://github.com/LeslieTrue/SFTvsRL and necessary data for V-IRL training. • 5 items • Updated 19 days ago • 8

upvoted a paper 19 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 26 days ago • 106

upvoted an article 25 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 27 days ago • 770

upvoted a collection 25 days ago

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 16 days ago • 194

upvoted a collection 27 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 28 days ago • 360

upvoted a collection 28 days ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 28 days ago • 100

upvoted 2 collections about 1 month ago

InternLM3

6 items • Updated 13 days ago • 23

Dolphin 3.0

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated 17 days ago • 92

upvoted a collection about 2 months ago

Cosmos

The collection of Cosmos models • 31 items • Updated Jan 17 • 262

upvoted an article about 2 months ago

MMLU-Pro-NoMath

Article • Jul 11, 2024 • 4

upvoted 2 collections 2 months ago

Granite Embedding Models

5 items • Updated 4 days ago • 5

Granite 3.1 Language Models

A series of language models with 128K context length trained by IBM licensed under Apache 2.0 license. • 9 items • Updated 4 days ago • 57