InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper • 2502.08910
Mixture Of Diffusers SDXL Tiling • Mixture of Diffusers implementation for XL Stable Diffusion
QR Code AI Art Generator • Blend QR codes with AI Art
New Research Alert: Making Language Models Smaller & Smarter!
Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance. The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision to the transformer arena.

Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
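The linked report and repository contain the authors' actual architecture; the sketch below is only a minimal PyTorch illustration of the general idea, assuming a grouped 1x1 convolution standing in for a dense linear projection. The module name, dimensions, and group count are hypothetical, and the parameter saving shown is for this toy configuration, not the 77% figure from the report.

```python
import torch
import torch.nn as nn

class GroupedPointwiseLinear(nn.Module):
    """Stand-in for nn.Linear built from a grouped 1x1 convolution.

    A dense Linear(d_in, d_out) stores d_in * d_out weights; splitting
    the channels into `groups` independent blocks cuts that to
    d_in * d_out / groups (plus the bias).
    """
    def __init__(self, d_in: int, d_out: int, groups: int):
        super().__init__()
        assert d_in % groups == 0 and d_out % groups == 0
        self.conv = nn.Conv1d(d_in, d_out, kernel_size=1, groups=groups)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_in); Conv1d expects (batch, channels, seq_len)
        return self.conv(x.transpose(1, 2)).transpose(1, 2)

# Hypothetical sizes for comparison only -- not the report's configuration.
dense = nn.Linear(1024, 1024)
grouped = GroupedPointwiseLinear(1024, 1024, groups=4)
n_dense = sum(p.numel() for p in dense.parameters())      # 1,049,600
n_grouped = sum(p.numel() for p in grouped.parameters())  # 263,168
print(f"saved {1 - n_grouped / n_dense:.0%} of the parameters")  # ~75%
```

With 4 groups this toy layer already saves roughly 75% of the projection's parameters, in the same ballpark as the reported figure. In practice, grouped stages are usually paired with an interleaving or channel-shuffle step so information can still flow across groups; the report describes its own grouping scheme.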
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • Paper • 2502.02737