Abreu Magalhães

Hildeberto

AI & ML interests

None yet

Organizations

None yet

Hildeberto's activity

upvoted 2 papers 16 days ago

Jailbreaking with Universal Multi-Prompts

Paper • 2502.01154 • Published 20 days ago • 8

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 19 days ago • 190

upvoted 3 papers 27 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 330

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published Jan 20 • 91

Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published Jan 20 • 32

upvoted a paper about 1 month ago

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published Jan 8 • 85

liked a model about 2 months ago

answerdotai/ModernBERT-large

Fill-Mask • Updated Jan 15 • 667k • 354

upvoted a paper about 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 134

upvoted 8 papers 4 months ago

COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training

Paper • 2410.19313 • Published Oct 25, 2024 • 19

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Paper • 2410.21169 • Published Oct 28, 2024 • 30

MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published Oct 22, 2024 • 15

Pre-training Distillation for Large Language Models: A Design Space Exploration

Paper • 2410.16215 • Published Oct 21, 2024 • 16

Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception

Paper • 2410.12788 • Published Oct 16, 2024 • 24

Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

Paper • 2410.10814 • Published Oct 14, 2024 • 50

Toward General Instruction-Following Alignment for Retrieval-Augmented Generation

Paper • 2410.09584 • Published Oct 12, 2024 • 48

Agent S: An Open Agentic Framework that Uses Computers Like a Human

Paper • 2410.08164 • Published Oct 10, 2024 • 24

upvoted 3 papers 5 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 171

Selective Attention Improves Transformer

Paper • 2410.02703 • Published Oct 3, 2024 • 24

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 145

liked a model 5 months ago

pt-mteb/average_pt_nilc_glove_s300

Sentence Similarity • Updated Apr 17, 2024 • 1