Improve model card: Add detailed description, usage, links, and training info for LoRI-D adapter

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +110 -99
README.md CHANGED
@@ -2,192 +2,203 @@
2
  base_model: meta-llama/Meta-Llama-3-8B
3
  library_name: peft
4
  pipeline_tag: text-generation
5
  ---
6
 
7
  # Model Card for LoRI-D_nlu_llama3_rank_32
8
 
9
  This model is part of [LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation](https://arxiv.org/abs/2504.07448).
10
 
11
- <!-- Provide a quick summary of what the model is/does. -->
12
-
13
 
 
 
 
14
 
15
  ## Model Details
16
 
17
  ### Model Description
18
 
19
- <!-- Provide a longer summary of what this model is. -->
20
-
21
-
22
-
23
- - **Developed by:** [More Information Needed]
24
- - **Funded by [optional]:** [More Information Needed]
25
- - **Shared by [optional]:** [More Information Needed]
26
- - **Model type:** [More Information Needed]
27
- - **Language(s) (NLP):** [More Information Needed]
28
- - **License:** [More Information Needed]
29
- - **Finetuned from model [optional]:** [More Information Needed]
30
 
31
- ### Model Sources [optional]
 
 
 
 
32
 
33
- <!-- Provide the basic links for the model. -->
34
 
35
- - **Repository:** [More Information Needed]
36
- - **Paper [optional]:** [More Information Needed]
37
- - **Demo [optional]:** [More Information Needed]
 
38
 
39
  ## Uses
40
 
41
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
42
-
43
  ### Direct Use
44
 
45
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
 
 
 
46
 
47
- [More Information Needed]
48
 
49
- ### Downstream Use [optional]
50
 
51
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
 
 
 
52
 
53
- [More Information Needed]
54
 
55
- ### Out-of-Scope Use
56
 
57
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
58
 
59
- [More Information Needed]
60
 
61
- ## Bias, Risks, and Limitations
62
 
63
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
64
 
65
- [More Information Needed]
 
 
 
66
 
67
- ### Recommendations
 
 
 
 
 
 
68
 
69
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 
 
70
 
71
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
72
 
73
- ## How to Get Started with the Model
 
 
74
 
75
- Use the code below to get started with the model.
 
 
 
 
76
 
77
- [More Information Needed]
 
 
78
 
79
  ## Training Details
80
 
81
  ### Training Data
82
 
83
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
 
 
 
84
 
85
- [More Information Needed]
86
 
87
  ### Training Procedure
88
 
89
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
90
-
91
- #### Preprocessing [optional]
92
-
93
- [More Information Needed]
94
 
 
95
 
96
  #### Training Hyperparameters
97
 
98
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
99
-
100
- #### Speeds, Sizes, Times [optional]
101
-
102
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
103
-
104
- [More Information Needed]
105
 
106
  ## Evaluation
107
 
108
- <!-- This section describes the evaluation protocols and provides the results. -->
109
-
110
  ### Testing Data, Factors & Metrics
111
 
112
- #### Testing Data
113
 
114
- <!-- This should link to a Dataset Card if possible. -->
115
 
116
- [More Information Needed]
117
 
118
  #### Factors
119
 
120
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
121
-
122
- [More Information Needed]
123
 
124
  #### Metrics
125
 
126
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
127
-
128
- [More Information Needed]
129
 
130
  ### Results
131
 
132
- [More Information Needed]
133
-
134
- #### Summary
135
-
136
-
137
 
138
- ## Model Examination [optional]
139
-
140
- <!-- Relevant interpretability work for the model goes here -->
141
-
142
- [More Information Needed]
143
-
144
- ## Technical Specifications [optional]
145
 
146
  ### Model Architecture and Objective
147
 
148
- [More Information Needed]
149
 
150
  ### Compute Infrastructure
151
 
152
- [More Information Needed]
153
-
154
  #### Hardware
155
 
156
- [More Information Needed]
157
 
158
  #### Software
159
 
160
- [More Information Needed]
161
-
162
- ## Citation [optional]
163
 
164
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
165
 
166
- **BibTeX:**
167
 
168
- [More Information Needed]
169
 
170
  **APA:**
171
-
172
  [More Information Needed]
173
 
174
- ## Glossary [optional]
175
-
176
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
177
-
178
- [More Information Needed]
179
-
180
- ## More Information [optional]
181
-
182
- [More Information Needed]
183
-
184
- ## Model Card Authors [optional]
185
-
186
- [More Information Needed]
187
 
188
  ## Model Card Contact
 
189
 
190
- [More Information Needed]
191
  ### Framework versions
192
 
193
  - PEFT 0.12.0
 
2
  base_model: meta-llama/Meta-Llama-3-8B
3
  library_name: peft
4
  pipeline_tag: text-generation
5
+ license: apache-2.0
6
+ tags:
7
+ - lora
8
+ - peft
9
+ - multi-task-learning
10
+ - continual-learning
11
+ - nlu
12
+ - code-generation
13
+ - mathematical-reasoning
14
+ - safety-alignment
15
  ---
16
 
17
  # Model Card for LoRI-D_nlu_llama3_rank_32
18
 
19
  This model is part of [LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation](https://arxiv.org/abs/2504.07448).
20
 
21
+ **LoRI** (LoRA with Reduced Interference) is a simple yet effective parameter-efficient fine-tuning (PEFT) method for Large Language Models (LLMs). It targets two weaknesses of standard LoRA: the overhead it still incurs and parameter interference in multi-task scenarios. LoRI freezes the projection matrices `A` as random projections and sparsifies the matrices `B` using task-specific masks. This design substantially reduces the number of trainable parameters while maintaining strong task performance. LoRI also minimizes cross-task interference in adapter merging by leveraging the orthogonality between adapter subspaces, and it supports continual learning by using sparsity to mitigate catastrophic forgetting.
 
22
 
23
+ <div align="center">
24
+ <img src="https://github.com/juzhengz/LoRI/raw/main/LoRI.png" alt="LoRI" width="80%">
25
+ </div>
26
 
27
  ## Model Details
28
 
29
  ### Model Description
30
 
31
+ LoRI-D_nlu_llama3_rank_32 is a LoRI adapter specifically fine-tuned for Natural Language Understanding (NLU) tasks. It is based on the `meta-llama/Meta-Llama-3-8B` base model with a LoRA rank of 32. This model leverages LoRI's design to offer efficient fine-tuning, reduced parameter overhead, and minimized cross-task interference.
32
 
33
+ - **Developed by:** [Juzheng Zhang](https://juzhengz.github.io/), [Jiacheng You](https://github.com/YouJiacheng), [Ashwinee Panda](https://kiddyboots216.github.io/), [Tom Goldstein](https://www.cs.umd.edu/~tomg/)
34
+ - **Model type:** Low-Rank Adaptation (LoRA) variant / PEFT adapter
35
+ - **Language(s) (NLP):** English
36
+ - **License:** Apache 2.0
37
+ - **Finetuned from model:** `meta-llama/Meta-Llama-3-8B`
38
 
39
+ ### Model Sources
40
 
41
+ - **Repository:** [https://github.com/juzhengz/LoRI/](https://github.com/juzhengz/LoRI/)
42
+ - **Paper:** [https://arxiv.org/abs/2504.07448](https://arxiv.org/abs/2504.07448)
43
+ - **Project Page:** [https://juzhengz.github.io/](https://juzhengz.github.io/)
44
+ - **Hugging Face Collection:** [https://huggingface.co/collections/tomg-group-umd/lori-adapters-67f795549d792613e1290011](https://huggingface.co/collections/tomg-group-umd/lori-adapters-67f795549d792613e1290011)
45
 
46
  ## Uses
47
 
 
 
48
  ### Direct Use
49
 
50
+ This adapter is meant to be loaded on top of its base model for inference or further adaptation. LoRI adapters cover the following tasks, of which this one targets NLU:
51
+ - Natural Language Understanding (NLU)
52
+ - Mathematical reasoning
53
+ - Code generation
54
+ - Safety alignment
55
 
56
+ It is particularly useful for researchers and practitioners looking for parameter-efficient fine-tuning solutions that reduce cross-task interference in multi-task and continual learning settings.
57
 
58
+ ### Out-of-Scope Use
59
 
60
+ This model is not intended for:
61
+ - Deployment in high-stakes applications requiring extremely high safety or ethical standards without further rigorous evaluation and fine-tuning.
62
+ - Generating content that is harmful, discriminatory, or promotes illegal activities.
63
+ - Use cases outside the specific tasks for which it has been fine-tuned without additional adaptation and validation.
64
+ - Use without a compatible base model, as it is a PEFT adapter.
65
 
66
+ ## Bias, Risks, and Limitations
67
 
68
+ As an adapter fine-tuned on a base Large Language Model (e.g., Llama-3-8B), this model inherits potential biases present in its foundational training data. This could lead to biased, harmful, or undesirable outputs. While LoRI aims to reduce cross-task interference, complex interactions in highly diverse multi-task setups might still occur. Performance on out-of-distribution data or tasks not covered during its training may vary.
69
 
70
+ ### Recommendations
71
 
72
+ Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. It is recommended to perform thorough evaluations specific to your application and data before deployment. Implement additional filtering or human-in-the-loop validation for critical use cases.
73
 
74
+ ## How to Get Started with the Model
75
 
76
+ Pretrained LoRI adapters are available on the Hugging Face Hub. To use this specific NLU adapter with its base model (`meta-llama/Meta-Llama-3-8B`), follow the example below:
77
 
78
+ ```python
79
+ from transformers import AutoModelForCausalLM, AutoTokenizer
80
+ from peft import PeftModel
81
+ import torch
82
 
83
+ # Load the base model (Meta-Llama-3-8B)
84
+ base_model_name = "meta-llama/Meta-Llama-3-8B"
85
+ base_model = AutoModelForCausalLM.from_pretrained(
86
+     base_model_name,
87
+     torch_dtype=torch.bfloat16,  # Use bfloat16 for Llama-3 if your hardware supports it
88
+     device_map="auto"  # Automatically maps model to available devices (e.g., GPU)
89
+ )
90
 
91
+ # Load the LoRI-D NLU adapter on top of the base model
92
+ lori_adapter_name = "tomg-group-umd/LoRI-D_nlu_llama3_rank_32"
93
+ model = PeftModel.from_pretrained(base_model, lori_adapter_name)
94
 
95
+ # Load the tokenizer
96
+ tokenizer = AutoTokenizer.from_pretrained(base_model_name)
97
 
98
+ # Example for text generation
99
+ prompt = "The quick brown fox jumps over the lazy"
100
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device) # Move inputs to model's device
101
 
102
+ # Generate text
103
+ model.eval() # Set model to evaluation mode
104
+ with torch.no_grad():
105
+     outputs = model.generate(**inputs, max_new_tokens=50, temperature=0.7, do_sample=True)
106
+ generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
107
 
108
+ print(f"Prompt: {prompt}")
109
+ print(f"Generated: {generated_text}")
110
+ ```
111
 
112
  ## Training Details
113
 
114
  ### Training Data
115
 
116
+ LoRI models are trained and evaluated on a variety of datasets covering different tasks:
117
+ - **Natural Language Understanding (NLU):** Specific NLU datasets (details in the main repository and paper).
118
+ - **Code Generation:** CodeAlpaca dataset.
119
+ - **Mathematical Reasoning:** GSM8K dataset.
120
+ - **Safety Alignment:** Saferpaca dataset.
121
 
122
+ For more detailed information on specific datasets used for each task, please refer to the [LoRI GitHub repository](https://github.com/juzhengz/LoRI/) and the accompanying [paper](https://arxiv.org/abs/2504.07448).
123
 
124
  ### Training Procedure
125
 
126
+ LoRI is implemented using Fully Sharded Data Parallel (FSDP) and can be executed in a multi-GPU environment. The training process typically involves two stages:
127
+ 1. **LoRI-D Training:** Initial training of the LoRI adapter with dense `B` matrices, after which sparse masks are extracted.
128
+ 2. **LoRI-S Training:** Continued training with sparse `B` matrices at 90% sparsity, using the extracted masks.
 
 
129
 
130
+ The provided training scripts in the GitHub repository support LLaMA-3-8B and Mistral-7B base models with adapter ranks of 32 and 64, performing both `LoRI-D` and `LoRI-S` training, followed by evaluation on downstream tasks.
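
The hand-off between the two stages can be illustrated with a small sketch. It assumes the masks are chosen by keeping the largest-magnitude entries of the trained `B` matrix; the helper name `extract_mask` and the toy 10 x 10 matrix are hypothetical, and the repository's scripts remain the reference for the actual procedure.

```python
import random

def extract_mask(b_matrix, sparsity=0.9):
    """Return a 0/1 mask keeping the largest-magnitude entries of b_matrix.

    Hypothetical helper: assumes magnitude-based selection, which may differ
    from the repository's exact procedure.
    """
    flat = sorted((abs(v) for row in b_matrix for v in row), reverse=True)
    keep = max(1, round(len(flat) * (1.0 - sparsity)))  # number of entries to keep
    threshold = flat[keep - 1]  # smallest magnitude that is still kept
    return [[1 if abs(v) >= threshold else 0 for v in row] for row in b_matrix]

random.seed(0)
# Toy stand-in for a B matrix after LoRI-D training.
b_dense = [[random.gauss(0.0, 1.0) for _ in range(10)] for _ in range(10)]

mask = extract_mask(b_dense, sparsity=0.9)
# LoRI-S would continue training only the entries where the mask is 1.
b_sparse = [[v * m for v, m in zip(row, mrow)] for row, mrow in zip(b_dense, mask)]

kept = sum(sum(row) for row in mask)
print(f"kept {kept} of 100 entries")
```

At 90% sparsity only a tenth of the entries stay trainable, which is where the parameter savings of the second stage come from.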
131
 
132
  #### Training Hyperparameters
133
 
134
+ The specific LoRA parameters for this `LoRI-D_nlu_llama3_rank_32` adapter, as extracted from `adapter_config.json`, are:
135
+ - `r`: 32
136
+ - `lora_alpha`: 64
137
+ - `lora_dropout`: 0.05
138
+ - `target_modules`: `["down_proj", "up_proj", "k_proj", "gate_proj", "v_proj", "q_proj", "o_proj"]`
139
+ - `peft_type`: `LORA`
140
+ - **Training regime:** Mixed precision (commonly fp16 or bf16 for LLMs).
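
As a rough worked example, the listed `r`, `lora_alpha`, and `target_modules` can be turned into trainable-parameter counts. The layer shapes below (hidden size 4096, MLP size 14336, key/value projection size 1024, 32 layers) are assumptions about Llama-3-8B that this card does not state, so the totals are back-of-the-envelope arithmetic rather than reported figures.

```python
# Values from adapter_config.json as listed above.
RANK = 32
LORA_ALPHA = 64
TARGET_MODULES = ["down_proj", "up_proj", "k_proj", "gate_proj", "v_proj", "q_proj", "o_proj"]

# Assumed Llama-3-8B shapes as (in_features, out_features); not stated in this card.
HIDDEN, INTERMEDIATE, KV_DIM, NUM_LAYERS = 4096, 14336, 1024, 32
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

scaling = LORA_ALPHA / RANK  # LoRA scales the low-rank update by lora_alpha / r

# Standard LoRA trains both A (r x in) and B (out x r).
lora_params = NUM_LAYERS * sum(RANK * (SHAPES[m][0] + SHAPES[m][1]) for m in TARGET_MODULES)
# LoRI-D freezes A, so only B (out x r) is trainable.
lori_d_params = NUM_LAYERS * sum(RANK * SHAPES[m][1] for m in TARGET_MODULES)
# LoRI-S keeps 10% of B's entries at 90% sparsity.
lori_s_params = lori_d_params // 10

print(f"scaling      = {scaling}")
print(f"LoRA         = {lora_params:,}")
print(f"LoRI-D       = {lori_d_params:,}")
print(f"LoRI-S (90%) = {lori_s_params:,}")
print(f"LoRI-S reduction vs LoRA = {1 - lori_s_params / lora_params:.1%}")
```

Under these assumed shapes, freezing `A` roughly halves the trainable parameters relative to standard LoRA, and the 90% sparsity of LoRI-S reduces them by another factor of ten.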
141
 
142
  ## Evaluation
143
 
 
 
144
  ### Testing Data, Factors & Metrics
145
 
146
+ The model's performance was evaluated extensively across various benchmarks relevant to Natural Language Understanding, Code Generation, Mathematical Reasoning, and Safety Alignment.
147
 
148
+ #### Testing Data
149
 
150
+ Performance was evaluated on standard benchmarks for each task. Specific datasets include HumanEval for code generation, GSM8K for mathematical reasoning, and Saferpaca for safety alignment.
151
 
152
  #### Factors
153
 
154
+ Evaluations are disaggregated by task (NLU, code, math, safety) and potentially by model size and adapter rank.
 
 
155
 
156
  #### Metrics
157
 
158
+ Performance was measured using standard metrics relevant to each task (e.g., accuracy for NLU/math, pass@1 for code generation).
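
For readers unfamiliar with these metrics, the sketch below shows how they are commonly computed. The card does not say which implementation was used; `accuracy` and `pass_at_k` are illustrative helper names, and the pass@k formula is the standard unbiased estimator introduced alongside HumanEval, assumed here rather than confirmed for this model.

```python
from math import comb

def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, of which c pass the unit tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Toy multiple-choice answers and a toy code-generation problem.
print(accuracy(["A", "B", "C", "D"], ["A", "B", "D", "D"]))
print(pass_at_k(n=10, c=3, k=1))
```

With `k=1` the estimator reduces to the fraction of generated samples that pass, averaged over problems.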
 
 
159
 
160
  ### Results
161
 
162
+ Extensive experiments demonstrated that LoRI consistently outperforms full fine-tuning and existing PEFT methods, while using up to 95% fewer trainable parameters than standard LoRA. In multi-task experiments, LoRI enabled effective adapter merging and continual learning with significantly reduced cross-task interference. For detailed evaluation results and comparisons, please refer to the [LoRI paper](https://arxiv.org/abs/2504.07448).
 
 
 
 
163
 
164
+ ## Technical Specifications
 
 
 
 
 
 
165
 
166
  ### Model Architecture and Objective
167
 
168
+ LoRI introduces modifications to the standard LoRA architecture. It freezes the low-rank projection matrix `A` as random projections and sparsifies the `B` matrix using task-specific masks. This approach aims to achieve mutual orthogonality between adapter subspaces, reduce cross-task interference, and significantly decrease the number of trainable parameters.
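
A minimal sketch of the forward pass this describes, written in plain Python for readability. It assumes the usual LoRA form `W x + (alpha / r) * B A x` with the mask applied elementwise to `B`, and zero initialization of `B` as in standard LoRA; the function names, toy dimensions, and checkerboard mask are all hypothetical and not taken from the LoRI codebase.

```python
import random

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def lori_forward(x, weight, a_frozen, b_trainable, mask, alpha, rank):
    """Compute W x + (alpha / rank) * (B * M) A x for one linear layer."""
    b_masked = [[b * m for b, m in zip(brow, mrow)] for brow, mrow in zip(b_trainable, mask)]
    base = matvec(weight, x)
    update = matvec(b_masked, matvec(a_frozen, x))
    return [h + (alpha / rank) * u for h, u in zip(base, update)]

random.seed(0)
d_in, d_out, rank, alpha = 8, 6, 2, 4

weight = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen base weight
a_frozen = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(rank)]  # random, never trained
b_trainable = [[0.0] * rank for _ in range(d_out)]  # zero init (assumed, as in standard LoRA)
mask = [[1 if (i + j) % 2 == 0 else 0 for j in range(rank)] for i in range(d_out)]  # toy task mask

x = [random.gauss(0, 1) for _ in range(d_in)]
y0 = lori_forward(x, weight, a_frozen, b_trainable, mask, alpha, rank)

# Pretend training changed every entry of B; masked-out entries have no effect.
b_trainable = [[0.5] * rank for _ in range(d_out)]
y1 = lori_forward(x, weight, a_frozen, b_trainable, mask, alpha, rank)
```

Before any training, `y0` equals the base layer output because `B` is zero. Afterwards only the entries of `B` selected by the mask contribute to `y1`, while `A` stays fixed throughout.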
169
 
170
  ### Compute Infrastructure
171
 
 
 
172
  #### Hardware
173
 
174
+ Training and evaluation were conducted on GPUs. Specific details on hardware configurations might be found within the supplementary materials of the paper or the GitHub repository's training scripts.
175
 
176
  #### Software
177
 
178
+ The codebase primarily leverages PyTorch for deep learning, along with the `transformers` and `peft` libraries from Hugging Face.
 
 
179
 
180
+ ## Citation
181
 
182
+ If you use LoRI in your work, please cite the following paper:
183
 
184
+ ```bibtex
185
+ @article{zhang2025lori,
186
+ title={LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation},
187
+ author={Zhang, Juzheng and You, Jiacheng and Panda, Ashwinee and Goldstein, Tom},
188
+ journal={arXiv preprint arXiv:2504.07448},
189
+ year={2025}
190
+ }
191
+ ```
192
 
193
  **APA:**
 
194
  [More Information Needed]
195
 
196
+ ## Model Card Authors
197
+ Niels, Hugging Face Community Science team.
198
 
199
  ## Model Card Contact
200
+ For questions about the model or LoRI project, please contact [[email protected]](mailto:[email protected]).
201
 
 
202
  ### Framework versions
203
 
204
  - PEFT 0.12.0