Andrei Semenov

Andron00e

https://andron00e.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a collection 3 days ago

Apertus LLM

commented on a paper 3 days ago

Benchmarking Optimizers for Large Language Model Pretraining

authored a paper 3 days ago

Gradient Clipping Improves AdaGrad when the Noise Is Heavy-Tailed

Organizations

upvoted a collection 3 days ago

Apertus LLM

Collection

4 items • Updated 4 days ago • 194

commented on a paper 3 days ago

Benchmarking Optimizers for Large Language Model Pretraining

Paper • 2509.01440 • Published 5 days ago • 21

authored 2 papers 3 days ago

Gradient Clipping Improves AdaGrad when the Noise Is Heavy-Tailed

Paper • 2406.04443 • Published Jun 6, 2024

Benchmarking Optimizers for Large Language Model Pretraining

Paper • 2509.01440 • Published 5 days ago • 21

updated a model 3 months ago

Andron00e/MNLP_M2_quantized_model

Text Generation • 0.2B • Updated Jun 10 • 19

published a model 3 months ago

Andron00e/MNLP_M2_quantized_model

Text Generation • 0.2B • Updated Jun 10 • 19

updated a model 3 months ago

Andron00e/MNLP_M2_mcqa_calibration

Updated Jun 10

published a model 3 months ago

Andron00e/MNLP_M2_mcqa_calibration

Updated Jun 10

published a dataset 3 months ago

Andron00e/MNLP_M2_mcqa_calibration

Updated Jun 10 • 2

updated a model 3 months ago

Andron00e/MNLP_M2_mcqa_model-4bit-gptq

Text Generation • 0.2B • Updated Jun 10 • 9

published a model 3 months ago

Andron00e/MNLP_M2_mcqa_model-4bit-gptq

Text Generation • 0.2B • Updated Jun 10 • 9

upvoted a paper 7 months ago

LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published Feb 12 • 29

upvoted a paper 9 months ago

Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning

Paper • 2412.11689 • Published Dec 16, 2024 • 2

commented on a paper 9 months ago

Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning

Paper • 2412.11689 • Published Dec 16, 2024 • 2

upvoted a paper 10 months ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 84

updated a Space 11 months ago

README

🏢

upvoted a paper 12 months ago

Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization

Paper • 2409.00492 • Published Aug 31, 2024 • 11

updated 2 Spaces about 1 year ago

Mamba Noise

🐠

README

📊

upvoted a collection about 1 year ago

MatMulfree LM

Collection

Pre-trained models for MatMulfree LM. • 4 items • Updated Jun 10, 2024 • 25
