Text Generation
PEFT
Safetensors
lora
code-generation
llama
nielsr (HF Staff) committed
Commit 9236a83 · verified · 1 Parent(s): d921df3

Improve model card for LoRI-S_code_llama3_rank_64


This PR significantly enhances the model card for `tomg-group-umd/LoRI-S_code_llama3_rank_64` by adding comprehensive information and improving its discoverability and usability.

Key updates include:
- **Metadata Enrichment:** Adding `license: apache-2.0` and relevant `tags` such as `peft`, `lora`, `code-generation`, and `llama`, along with the specific `datasets` used for training this model.
- **Detailed Model Description:** Populating the "Model Details" section with information about the developers, model type, language, and the base model, based on the paper abstract and GitHub repository.
- **Complete Model Sources:** Adding direct links to the official GitHub repository, the Hugging Face paper page, the project page, and the Hugging Face collection.
- **Elaborated Usage Instructions:** Filling in "Uses" sections (Direct Use, Downstream Use, Out-of-Scope) to clarify the model's intended applications and limitations.
- **Executable Code Snippet:** Providing a runnable Python code example in "How to Get Started" for quick inference using `transformers` and `peft`.
- **Training Information:** Detailing the "Training Data" and "Training Procedure" (LoRI-D and LoRI-S stages, FSDP) and "Training Hyperparameters" (rank, sparsity, etc.).
- **Evaluation Summary:** Summarizing key evaluation aspects and directing users to the paper for detailed results.
- **Citation:** Including the BibTeX entry from the paper.
- **Visual Aid:** Embedding the LoRI architecture diagram from the GitHub repository.

This update makes the model card much more informative and user-friendly for researchers and practitioners.

Files changed (1)
  1. README.md +132 -121
README.md CHANGED
@@ -2,192 +2,203 @@
  base_model: meta-llama/Meta-Llama-3-8B
  library_name: peft
  pipeline_tag: text-generation
  ---

  # Model Card for LoRI-S_code_llama3_rank_64

  This model is part of [LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation](https://arxiv.org/abs/2504.07448).

- <!-- Provide a quick summary of what the model is/does. -->
-

  ## Model Details

  ### Model Description

- <!-- Provide a longer summary of what this model is. -->
-
-

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]

  ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]

  ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  ## How to Get Started with the Model

- Use the code below to get started with the model.
-
- [More Information Needed]

  ## Training Details

  ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]

  ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]

  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]
-
- #### Summary

-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Technical Specifications [optional]

  ### Model Architecture and Objective

- [More Information Needed]

  ### Compute Infrastructure

- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
  #### Software

- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]

- **APA:**

- [More Information Needed]

- ## Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]
- ### Framework versions
-
- - PEFT 0.12.0
 
  base_model: meta-llama/Meta-Llama-3-8B
  library_name: peft
  pipeline_tag: text-generation
+ license: apache-2.0
+ tags:
+ - peft
+ - lora
+ - code-generation
+ - llama
+ datasets:
+ - CodeAlpaca
  ---

  # Model Card for LoRI-S_code_llama3_rank_64

  This model is part of [LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation](https://arxiv.org/abs/2504.07448).

+ LoRI (LoRA with Reduced Interference) is a simple yet effective parameter-efficient fine-tuning (PEFT) method for Large Language Models (LLMs). It addresses the parameter overhead and cross-task interference that arise when multiple LoRA adapters are trained and merged, by freezing the projection matrices `A` as random projections and sparsifying the matrices `B` with task-specific masks. This design substantially reduces the number of trainable parameters while maintaining strong task performance, minimizing cross-task interference in adapter merging, and supporting continual learning by mitigating catastrophic forgetting.
+
+ <div align="center">
+ <img src="https://github.com/juzhengz/LoRI/raw/main/LoRI.png" alt="LoRI Architecture" width="80%">
+ </div>
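+
+ To make the design concrete, here is a minimal, illustrative PyTorch sketch of a LoRI-style adapted linear layer (frozen random `A`, binary mask on `B`). It is not the authors' implementation; the class name, initialisation, and scaling convention are assumptions for illustration only — see the official repository for the actual code.
+
+ ```python
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ class LoRILinearSketch(nn.Module):
+     """Illustrative LoRI-style adapter: y = W0 x + (alpha / r) * (M * B) A x."""
+
+     def __init__(self, base_linear: nn.Linear, r: int = 64, alpha: int = 128):
+         super().__init__()
+         self.base = base_linear                      # pretrained projection W0 (frozen)
+         for p in self.base.parameters():
+             p.requires_grad = False
+         d_out, d_in = base_linear.out_features, base_linear.in_features
+         # A: frozen random projection (never trained)
+         self.register_buffer("A", torch.randn(r, d_in) / r**0.5)
+         # B: trainable factor, zero-initialised as in LoRA
+         self.B = nn.Parameter(torch.zeros(d_out, r))
+         # Binary mask on B; all-ones here, replaced by a sparse task-specific mask in LoRI-S
+         self.register_buffer("mask", torch.ones(d_out, r))
+         self.scaling = alpha / r
+
+     def forward(self, x: torch.Tensor) -> torch.Tensor:
+         delta_w = (self.mask * self.B) @ self.A      # masked low-rank update
+         return self.base(x) + self.scaling * F.linear(x, delta_w)
+
+ # Tiny smoke test
+ layer = LoRILinearSketch(nn.Linear(16, 16), r=4, alpha=8)
+ print(layer(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
+ ```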

  ## Model Details

  ### Model Description

+ LoRI-S_code_llama3_rank_64 is a LoRI-S (sparse) adapter trained for code generation. It is built on the `meta-llama/Meta-Llama-3-8B` base model with an adapter rank of 64. LoRI has been shown to outperform full fine-tuning and existing PEFT methods while using up to 95% fewer trainable parameters than standard LoRA. This model is part of a broader set of LoRI adapters covering natural language understanding, mathematical reasoning, code generation, and safety alignment.
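+
+ As a rough back-of-the-envelope check of that claim (a sketch with assumed dimensions, not figures from the paper): for a single `d x d` projection, standard LoRA trains both `A` and `B`, while LoRI-S trains only the roughly 10% of `B` entries kept by a 90%-sparse mask.
+
+ ```python
+ # Illustrative trainable-parameter count for one projection (assumed d = 4096, r = 64)
+ d, r = 4096, 64
+
+ lora_params = d * r + r * d          # LoRA: both B (d x r) and A (r x d) are trained
+ lori_s_params = 0.10 * d * r         # LoRI-S: A is frozen, B is 90% sparse
+
+ print(lora_params, int(lori_s_params))                                      # 524288 vs 26214
+ print(f"{1 - lori_s_params / lora_params:.0%} fewer trainable parameters")  # 95%
+ ```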

+ - **Developed by:** Juzheng Zhang, Jiacheng You, Ashwinee Panda, Tom Goldstein
+ - **Model type:** Low-Rank Adaptation (LoRA) variant (LoRI-S); a parameter-efficient fine-tuning (PEFT) adapter for causal language models
+ - **Language(s) (NLP):** English
+ - **License:** Apache-2.0
+ - **Finetuned from model:** `meta-llama/Meta-Llama-3-8B`
+
+ ### Model Sources
+
+ - **Repository:** [https://github.com/juzhengz/LoRI](https://github.com/juzhengz/LoRI)
+ - **Paper:** [https://huggingface.co/papers/2504.07448](https://huggingface.co/papers/2504.07448)
+ - **Project Page:** [https://juzhengz.github.io/](https://juzhengz.github.io/)
+ - **Hugging Face Collection:** [https://huggingface.co/collections/tomg-group-umd/lori-adapters-67f795549d792613e1290011](https://huggingface.co/collections/tomg-group-umd/lori-adapters-67f795549d792613e1290011)

  ## Uses

  ### Direct Use

+ This model is intended to be used as a PEFT adapter that specializes the `meta-llama/Meta-Llama-3-8B` base model for code generation. Load it with the Hugging Face `peft` library on top of the base LLM, as shown in the "How to Get Started" section below.

+ ### Downstream Use
+
+ LoRI adapters are designed for multi-task scenarios and continual learning, where they enable effective adapter merging and reduce cross-task interference. This model can be combined with other LoRI adapters trained for different tasks to build multi-task systems.
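+
+ As an illustration of multi-task use, the sketch below merges two LoRI adapters with the generic weighted-merging utilities of the `peft` library. The second adapter ID and the merging weights are placeholders, and the paper's own merging procedure may differ — consult the official repository for the recommended workflow.
+
+ ```python
+ from transformers import AutoModelForCausalLM
+ from peft import PeftModel
+
+ # dtype/device arguments omitted for brevity
+ base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
+
+ # Load the code adapter, then a second LoRI adapter (hypothetical ID shown for illustration)
+ model = PeftModel.from_pretrained(base, "tomg-group-umd/LoRI-S_code_llama3_rank_64", adapter_name="code")
+ model.load_adapter("tomg-group-umd/LoRI-S_nlu_llama3_rank_64", adapter_name="nlu")  # placeholder adapter ID
+
+ # Generic linear merge of the two adapters into a new one (weights are illustrative)
+ model.add_weighted_adapter(adapters=["code", "nlu"], weights=[0.5, 0.5],
+                            adapter_name="merged", combination_type="linear")
+ model.set_adapter("merged")
+ ```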

  ### Out-of-Scope Use

+ This adapter is not intended for standalone use; it requires `meta-llama/Meta-Llama-3-8B` as its base model. Like all large language models, it may generate biased, harmful, or factually incorrect content and should not be used in critical applications without thorough evaluation and additional safeguards.

  ## Bias, Risks, and Limitations

+ While LoRI reduces interference and parameter overhead, the model may still inherit biases present in its pre-training or fine-tuning data (e.g., CodeAlpaca and Meta-Llama-3-8B's pre-training corpus). Potential risks and limitations include:
+ - **Generalization:** Performance may degrade on code generation tasks that differ significantly from the training distribution.
+ - **Factual Accuracy:** Generated code or comments may not always be logically sound or correct.
+ - **Safety:** If not properly constrained, the model may generate insecure or malicious code, or outputs that perpetuate stereotypes or harmful content.

  ### Recommendations

+ Users (both direct and downstream) should be aware of these potential issues and implement appropriate validation and filtering of the model's outputs. Responsible AI practices and task-specific evaluations are recommended.

  ## How to Get Started with the Model

+ Use the code below to get started with the model:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ # 1. Load the base model
+ base_model_name = "meta-llama/Meta-Llama-3-8B"
+ base_model = AutoModelForCausalLM.from_pretrained(
+     base_model_name,
+     torch_dtype=torch.bfloat16,  # Llama 3 models are typically run in bfloat16
+     device_map="auto",           # place the model on available devices (GPU if present)
+     low_cpu_mem_usage=True,      # reduce CPU memory usage while loading
+ )
+
+ # 2. Load the LoRI adapter
+ # Replace "tomg-group-umd/LoRI-S_code_llama3_rank_64" with a different adapter ID if needed
+ adapter_model_id = "tomg-group-umd/LoRI-S_code_llama3_rank_64"
+ adapter_model = PeftModel.from_pretrained(base_model, adapter_model_id)
+
+ # 3. Load the tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(base_model_name)
+ # Set pad_token if not already set; required for batching/generation
+ if tokenizer.pad_token is None:
+     tokenizer.pad_token = tokenizer.eos_token  # or another appropriate token
+
+ # 4. Set the model to evaluation mode
+ adapter_model.eval()
+
+ # 5. Prepare your input prompt for code generation
+ prompt = '''
+ def bubble_sort(arr):
+     n = len(arr)
+     for i in range(n - 1):
+         for j in range(0, n - i - 1):
+             if arr[j] > arr[j + 1]:
+                 arr[j], arr[j + 1] = arr[j + 1], arr[j]
+     return arr
+
+ # Write a docstring for the function above, describing its purpose and parameters.
+ '''
+
+ # Encode the prompt and move it to the model's device
+ input_ids = tokenizer.encode(prompt, return_tensors="pt").to(adapter_model.device)
+
+ # 6. Generate output
+ with torch.no_grad():
+     output_ids = adapter_model.generate(
+         input_ids,
+         max_new_tokens=100,
+         do_sample=True,
+         temperature=0.01,  # very low temperature for near-deterministic code
+         top_p=0.95,        # nucleus sampling
+         num_return_sequences=1,
+         eos_token_id=tokenizer.eos_token_id,
+         pad_token_id=tokenizer.pad_token_id,
+     )
+
+ # Decode and print the generated text
+ generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+ print(generated_text)
+
+ # Optional: merge the adapter weights into the base model for easier deployment
+ # merged_model = adapter_model.merge_and_unload()
+ # merged_model.save_pretrained("path/to/merged-lori-model")
+ ```

  ## Training Details

  ### Training Data

+ This `LoRI-S_code_llama3_rank_64` adapter was fine-tuned on the **CodeAlpaca** dataset for code generation (a brief data-loading sketch follows the list below). The LoRI paper also describes experiments on:
+ - **Natural Language Understanding (NLU):** GLUE benchmark
+ - **Mathematical Reasoning:** GSM8K dataset
+ - **Safety Alignment:** Saferpaca dataset
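+
+ For reference, one way to inspect a public CodeAlpaca mirror with the `datasets` library is sketched below; the dataset ID is an assumption, and the exact data files and preprocessing used for LoRI training may differ (see the official repository).
+
+ ```python
+ from datasets import load_dataset
+
+ # "sahil2801/CodeAlpaca-20k" is a commonly used public mirror, not necessarily the repo's source
+ ds = load_dataset("sahil2801/CodeAlpaca-20k", split="train")
+ print(ds[0])  # records typically contain an instruction, an optional input, and an output
+ ```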

  ### Training Procedure

+ LoRI training involves two training stages with an intermediate mask-extraction step, implemented with Fully Sharded Data Parallel (FSDP) for efficient multi-GPU training:
+ 1. **LoRI-D (Dense) Training:** An initial phase in which the projection matrices `A` are frozen as random projections and the `B` matrices are trained densely.
+ 2. **Mask Extraction:** After LoRI-D training, sparse masks are extracted from the learned `B` matrices; for LoRI-S models a high sparsity level (e.g., 90%) is typically applied (a sketch of this step is shown below).
+ 3. **LoRI-S (Sparse) Training:** Training continues with these extracted sparse masks. This model, `LoRI-S_code_llama3_rank_64`, is the result of the sparsified training phase.
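+
+ The mask-extraction step can be pictured as magnitude pruning of the learned `B` matrices. The sketch below keeps the top 10% of entries by absolute value (matching 90% sparsity); the actual selection criterion and granularity used by LoRI may differ — see the official repository.
+
+ ```python
+ import torch
+
+ def extract_sparse_mask(B: torch.Tensor, sparsity: float = 0.9) -> torch.Tensor:
+     """Return a binary mask keeping the largest-magnitude (1 - sparsity) fraction of B's entries."""
+     k = max(1, int(round((1.0 - sparsity) * B.numel())))       # number of entries to keep
+     threshold = B.abs().flatten().kthvalue(B.numel() - k + 1).values
+     return (B.abs() >= threshold).to(B.dtype)
+
+ B = torch.randn(4096, 64)            # a dense B matrix learned in the LoRI-D stage
+ mask = extract_sparse_mask(B, 0.9)   # fixed task-specific mask reused during LoRI-S training
+ print(mask.mean().item())            # ~0.10 -> about 10% of entries kept
+ ```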

  #### Training Hyperparameters

+ - **Base Model:** `meta-llama/Meta-Llama-3-8B`
+ - **Adapter Rank (`r`):** 64
+ - **LoRA Alpha (`lora_alpha`):** 128
+ - **LoRA Dropout (`lora_dropout`):** 0.05
+ - **Sparsity (LoRI-S phase):** 90%
+ - **Training Regime:** bf16 mixed precision (an illustrative adapter configuration is sketched below)
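+
+ For orientation, these settings roughly correspond to the adapter configuration below. This is a plain `LoraConfig` shown for reference only: vanilla `peft` does not implement LoRI's frozen-`A`/sparse-`B` training (that logic lives in the authors' repository), and the `target_modules` listed here are an assumption.
+
+ ```python
+ from peft import LoraConfig
+
+ config = LoraConfig(
+     r=64,                    # adapter rank
+     lora_alpha=128,          # scaling factor
+     lora_dropout=0.05,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed; check the repo
+     task_type="CAUSAL_LM",
+ )
+ ```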

  ## Evaluation

+ LoRI models have been extensively evaluated across natural language understanding, mathematical reasoning, code generation, and safety alignment tasks. Experiments demonstrate that LoRI outperforms full fine-tuning and existing PEFT methods while using significantly fewer trainable parameters (up to 95% fewer than LoRA). In multi-task settings, LoRI enables effective adapter merging and continual learning with reduced cross-task interference.

  ### Results

+ For detailed quantitative results, task-specific metrics (e.g., HumanEval for code generation, GSM8K for mathematical reasoning), and comprehensive comparisons against baselines, please refer to the [official paper](https://huggingface.co/papers/2504.07448).

+ ## Technical Specifications

  ### Model Architecture and Objective

+ LoRI modifies the standard LoRA architecture by fixing the projection matrices `A` as random projections and sparsifying the matrices `B` with task-specific masks. This design reduces cross-task interference in multi-task learning and mitigates catastrophic forgetting in continual learning scenarios.
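+
+ In generic notation (chosen here for illustration; the paper's exact symbols and scaling convention may differ), a LoRI-adapted projection computes
+
+ $$
+ h = W_0 x + \frac{\alpha}{r} (M \odot B) A x,
+ $$
+
+ where `W_0` is the frozen pretrained weight, `A` is a fixed random projection of rank `r`, `B` is the trainable low-rank factor, and `M` is a task-specific binary mask on `B` (90% sparse for LoRI-S). Only the unmasked entries of `B` are updated during training.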

  ### Compute Infrastructure

  #### Software

+ - PEFT 0.12.0
+ - Transformers (a version that supports Llama 3 and the PEFT integration)

+ ## Citation

+ If you use LoRI in your work, please cite:
+
+ ```bibtex
+ @article{zhang2025lori,
+   title={LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation},
+   author={Zhang, Juzheng and You, Jiacheng and Panda, Ashwinee and Goldstein, Tom},
+   journal={arXiv preprint arXiv:2504.07448},
+   year={2025}
+ }
+ ```

+ ## Model Card Authors

+ Niels Rogge (Hugging Face Community Science Team)

  ## Model Card Contact