Mechanistic Permutability: Match Features Across Layers • Paper • 2410.07656 • Published Oct 10, 2024 • 18
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation • Paper • 2409.06820 • Published Sep 10, 2024 • 64
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data • Paper • 2407.16680 • Published Jul 23, 2024 • 12
hiieu/Meta-Llama-3-8B-Instruct-function-calling-json-mode • Text Generation • Updated May 15, 2024 • 170 • 73