Elijah Wilt PRO

ooj

AI & ML interests

None yet

Recent Activity

liked a model 10 days ago

cognitivecomputations/Dolphin3.0-R1-Mistral-24B

liked a model 17 days ago

mistralai/Mistral-Small-24B-Instruct-2501

liked a model about 1 month ago

unsloth/DeepSeek-R1-GGUF

Organizations

None yet

ooj's activity

upvoted 13 papers about 1 month ago

LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models

Paper • 2411.09595 • Published Nov 14, 2024 • 72

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 92

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 134

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 90

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 257

Entropy-Guided Attention for Private LLMs

Paper • 2501.03489 • Published Jan 7 • 14

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published Jan 9 • 89

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10 • 67

Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 84

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 106

upvoted a collection about 1 month ago

Qwen2.5-Math

Collection

Math-specific model series based on Qwen2.5 • 11 items • Updated Jan 14 • 75

upvoted 5 papers about 1 month ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 91

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 273

Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography

Paper • 2501.08970 • Published Jan 15 • 6

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published Jan 16 • 69

The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models

Paper • 2501.09653 • Published Jan 16 • 12

upvoted a paper 8 months ago

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117