Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2309.16039

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 88
Effective Long-Context Scaling of Foundation Models

Paper • 2309.16039 • Published Sep 27, 2023 • 30

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023 • 19
TextBind: Multi-turn Interleaved Multimodal Instruction-following

Paper • 2309.08637 • Published Sep 14, 2023 • 8
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model

Paper • 2309.16058 • Published Sep 27, 2023 • 55
Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 35

Towards LLMs for grown-ups who need to read/write more than a couple of paragraphs.

tiiuae/falcon-180B

Text Generation • Updated Sep 6, 2023 • 6.69k • 1.14k
tiiuae/falcon-180B-chat

Text Generation • Updated Nov 7, 2023 • 109k • 544
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Paper • 2309.14509 • Published Sep 25, 2023 • 18
Effective Long-Context Scaling of Foundation Models

Paper • 2309.16039 • Published Sep 27, 2023 • 30

From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting

Paper • 2309.04269 • Published Sep 8, 2023 • 33
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning

Paper • 2309.04663 • Published Sep 9, 2023 • 6
Effective Long-Context Scaling of Foundation Models

Paper • 2309.16039 • Published Sep 27, 2023 • 30
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

Paper • 2310.11784 • Published Oct 18, 2023 • 11

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 23
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 17
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 10
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 12

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs