Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.01335

shisa-v2-research

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 66
Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28, 2024 • 97
argilla/magpie-ultra-v1.0

Viewer • Updated Nov 26, 2024 • 3.22M • 6.85k • 41
simplescaling/s1K-1.1

Viewer • Updated 13 days ago • 1k • 2.82k • 54

Tradecraft Patterns

Methods and analysis for generating synthetic data to populate graphs at scale, based on network motif (patterns) of tradecraft.

InGram: Inductive Knowledge Graph Embedding via Relation Graphs

Paper • 2305.19987 • Published May 31, 2023 • 2
Curating Grounded Synthetic Data with Global Perspectives for Equitable A

Paper • 2406.10258 • Published Jun 10, 2024 • 1
Peregrine: A Pattern-Aware Graph Mining System

Paper • 2004.02369 • Published Apr 6, 2020 • 1
OFFER: A Motif Dimensional Framework for Network Representation Learning

Paper • 2008.12010 • Published Aug 27, 2020 • 1

Synthetic Data Generation

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Paper • 2305.07759 • Published May 12, 2023 • 33
Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28, 2024 • 97

Papers - Fine-tuning

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 18
SELF: Language-Driven Self-Evolution for Large Language Model

Paper • 2310.00533 • Published Oct 1, 2023 • 2
QLoRA: Efficient Finetuning of Quantized LLMs

Paper • 2305.14314 • Published May 23, 2023 • 49
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

Paper • 2309.14717 • Published Sep 26, 2023 • 44

Self Improvement

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2, 2024 • 65

DIBT Prompt collective SPIN

This collection contains resources related to the replication of SPIN with the dibt prompt collective dataset

argilla/zephyr-7b-spin-iter0-v0

Text Generation • Updated Mar 13, 2024 • 17 • 1
argilla/zephyr-7b-spin-iter1-v0

Text Generation • Updated Mar 13, 2024 • 24 • 1
argilla/zephyr-7b-spin-iter2-v0

Text Generation • Updated Mar 13, 2024 • 19 • 1
argilla/zephyr-7b-spin-iter3-v0

Text Generation • Updated Mar 13, 2024 • 22 • 8

about 23 hours ago

ibm-research/AttaQ

Viewer • Updated Jan 26, 2024 • 1.4k • 1.25k • 15
snorkelai/snorkel-curated-instruction-tuning

Preview • Updated Mar 11, 2024 • 136 • 8
corbyrosset/researchy_questions

Viewer • Updated Feb 29, 2024 • 96.4k • 138 • 25
argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 508 • 70

A Critical Evaluation of AI Feedback for Aligning Large Language Models

Paper • 2402.12366 • Published Feb 19, 2024 • 3
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16, 2024 • 35
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

Paper • 2404.14723 • Published Apr 23, 2024 • 10
Self-Play Preference Optimization for Language Model Alignment

Paper • 2405.00675 • Published May 1, 2024 • 27

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 105
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15, 2024 • 42
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15, 2024 • 22
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15, 2024 • 37

Learning from feedback dir

Suppressing Pink Elephants with Direct Principle Feedback

Paper • 2402.07896 • Published Feb 12, 2024 • 11
Policy Improvement using Language Feedback Models

Paper • 2402.07876 • Published Feb 12, 2024 • 9
Direct Language Model Alignment from Online AI Feedback

Paper • 2402.04792 • Published Feb 7, 2024 • 31
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2, 2024 • 65

Previous
1
2
3
4
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs