Building on HF

Şuayp Talha Kocabay PRO

suayptalha

https://discord.com/users/suaypt

AI & ML interests

NLP, LLMs, Transformers, Merging, RNNs, CNNs, ANNs, Computer Vision and ML algorithms

Recent Activity

upvoted a paper 1 day ago

Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling

liked a model 1 day ago

tiiuae/Falcon-H1R-7B

liked a Space 5 days ago

alibayram/mteb-turkish

upvoted a paper 1 day ago

Falcon-H1R: Pushing the Reasoning Frontiers with a Hybrid Model for Efficient Test-Time Scaling

Paper • 2601.02346 • Published 2 days ago • 16

upvoted a paper 2 months ago

Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training

Paper • 2511.01918 • Published Nov 1, 2025 • 11

upvoted a collection 8 months ago

Arcana Qwen3-2.4B-A0.6B

Qwen3 MoE model • 5 items • Updated Jul 25, 2025 • 2

upvoted an article 8 months ago

Create Mixtures of Experts with MergeKit

Article • Mar 28, 2024 • 27

upvoted a collection 8 months ago

Qwen3

84 items • Updated 8 days ago • 1.54k

upvoted a paper 9 months ago

Antidistillation Sampling

Paper • 2504.13146 • Published Apr 17, 2025 • 59

upvoted 3 collections 9 months ago

EchoLLaMA: 3D-to-Speech with Multimodal AI

This collection contains the models and datasets used in EchoLLaMA: 3D-to-Speech with Multimodal AI paper. • 4 items • Updated Apr 7, 2025 • 4

Llama 4

Llama 4 release • 13 items • Updated Apr 29, 2025 • 678

SLMs

4 items • Updated Mar 28, 2025 • 2

upvoted 2 papers 10 months ago

Model Stock: All we need is just a few fine-tuned models

Paper • 2403.19522 • Published Mar 28, 2024 • 13

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25, 2025 • 50

upvoted 3 collections 11 months ago

Harezmi-25

Harezmi-25 is the Turkish chess engine project. • 3 items • Updated Feb 4, 2025 • 2

Maestro Models

Maestro LLMs based on DeepSeek's distilled models • 2 items • Updated Apr 6, 2025 • 2

DeepSeek-R1

10 items • Updated Nov 27, 2025 • 826

upvoted a paper 12 months ago

Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published Jan 9, 2025 • 60

upvoted a paper about 1 year ago

Were RNNs All We Needed?

Paper • 2410.01201 • Published Oct 2, 2024 • 53

upvoted 4 collections about 1 year ago

minGRU

Hugging Face integration of minGRU RNN models • 2 items • Updated Jan 7, 2025 • 4

Open LLM Leaderboard best models ❤️‍🔥

A daily uploaded list of models with best evaluations on the LLM leaderboard • 65 items • Updated Mar 20, 2025 • 657

FastLlama

A Faster and Higher-performing FastLlama Series • 4 items • Updated Dec 30, 2024 • 4

Merge Models

Models I merged using mergekit library • 8 items • Updated Apr 10, 2025 • 4