Collections
Discover the best community collections!
Collections including paper arxiv:2311.03285

- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 31
- Efficient Memory Management for Large Language Model Serving with PagedAttention
  Paper • 2309.06180 • Published • 25
- zhihan1996/DNABERT-2-117M
  Updated • 142k • 63
- AIRI-Institute/gena-lm-bert-base
  Updated • 97 • 28

- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 31
- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 35
- CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
  Paper • 2311.03354 • Published • 8

- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 31
- Tailoring Self-Rationalizers with Multi-Reward Distillation
  Paper • 2311.02805 • Published • 7
- Ultra-Long Sequence Distributed Transformer
  Paper • 2311.02382 • Published • 6
- OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
  Paper • 2309.11235 • Published • 15

- FlashDecoding++: Faster Large Language Model Inference on GPUs
  Paper • 2311.01282 • Published • 36
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 31
- Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
  Paper • 2311.06243 • Published • 22
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
  Paper • 2311.05908 • Published • 16

- PockEngine: Sparse and Efficient Fine-tuning in a Pocket
  Paper • 2310.17752 • Published • 14
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 31
- Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
  Paper • 2311.06243 • Published • 22
- Fine-tuning Language Models for Factuality
  Paper • 2311.08401 • Published • 29

- LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery
  Paper • 2310.18356 • Published • 23
- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
  Paper • 2310.08659 • Published • 27
- ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
  Paper • 2309.16119 • Published • 1
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44