Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2405.01535

Curated resources that support the use of LLMs to serve as automatic evaluators of other LLM outputs.

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Paper • 2310.17631 • Published Oct 26, 2023 • 34
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 54
Generative Judge for Evaluating Alignment

Paper • 2310.05470 • Published Oct 9, 2023 • 1
Calibrating LLM-Based Evaluator

Paper • 2309.13308 • Published Sep 23, 2023 • 12

A Zero-Shot Language Agent for Computer Control with Structured Reflection

Paper • 2310.08740 • Published Oct 12, 2023 • 16
AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 35
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Paper • 2308.10848 • Published Aug 21, 2023 • 1
CLEX: Continuous Length Extrapolation for Large Language Models

Paper • 2310.16450 • Published Oct 25, 2023 • 10

Previous
1
...
4
5
6
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs