view article Article Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs By davidberenstein1957 and 1 other • May 7 • 40
view article Article RealPerformance, A Dataset of Language Model Business Compliance Issues By davidberenstein1957 and 1 other • about 1 month ago • 4
view article Article LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs By davidberenstein1957 and 3 others • Jul 2 • 16
RealHarm: A Collection of Real-World Language Model Application Failures Paper • 2504.10277 • Published Apr 14 • 11
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 246
view article Article License to Call: Introducing Transformers Agents 2.0 By m-ric and 2 others • May 13, 2024 • 134
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 128