---
base_model: meta-llama/Meta-Llama-3-8B
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
---

# Model Card for LoRI-D_nlu_llama3_rank_64

This model is part of [LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation](https://arxiv.org/abs/2504.07448).

This is an adapter model based on the paper **LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation**, which introduces a simple yet effective approach to Low-Rank Adaptation (LoRA) for Large Language Models (LLMs). LoRI freezes the projection matrices A as random projections and sparsifies the matrices B using task-specific masks. This design substantially reduces the number of trainable parameters while maintaining strong task performance, minimizes cross-task interference in adapter merging, and supports continual learning by using sparsity to mitigate catastrophic forgetting.

*Figure: LoRI framework.*

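The mechanism described above (a frozen random `A`, and a `B` restricted by a task-specific mask) can be illustrated with a toy, dependency-free sketch. Everything here is an assumption made for illustration: the shape convention (`A` is `r x d_in`, `B` is `d_out x r`, update `B @ A`), the tiny sizes, and the hand-written mask. It is not the authors' implementation, and it omits the usual LoRA scaling factor.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lori_delta(B, mask, A):
    """Return (B * mask) @ A, the low-rank update added to a frozen weight."""
    masked_B = [[b * m for b, m in zip(b_row, m_row)] for b_row, m_row in zip(B, mask)]
    return matmul(masked_B, A)


random.seed(0)
d_out, d_in, r = 4, 6, 2  # toy sizes, chosen only for illustration

# A is drawn once at random and then kept frozen (it is never trained).
A = [[random.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(r)]
# B holds the trainable values.
B = [[random.gauss(0.0, 1.0) for _ in range(r)] for _ in range(d_out)]
# Hypothetical task-specific binary mask over B: only two entries stay active.
mask = [[1, 0], [0, 0], [0, 1], [0, 0]]

delta_W = lori_delta(B, mask, A)  # shape: d_out x d_in
```

Rows of `B` that are fully masked contribute nothing to `delta_W`, which is how sparsity limits the part of the weight update a given task can touch.
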
### ✨ Key Highlights

* **Scalable & Efficient**: Uses up to 95% fewer trainable parameters than traditional LoRA while maintaining performance.
* **Reduced Interference**: Minimizes cross-task interference in multi-task scenarios by leveraging orthogonality between adapter subspaces.
* **Continual Learning**: Supports continual learning by using sparsity to mitigate catastrophic forgetting.
* **Universal Applicability**: Evaluated across natural language understanding, mathematical reasoning, code generation, and safety alignment tasks.

## Model Details

### Model Description

The `LoRI-D_nlu_llama3_rank_64` model is a LoRI-D adapter for Natural Language Understanding (NLU) tasks, trained on top of the `meta-llama/Meta-Llama-3-8B` base model with a rank of 64. It is part of the LoRI family of models, which aims to provide parameter-efficient fine-tuning with reduced cross-task interference.

- **Developed by:** Juzheng Zhang, Jiacheng You, Ashwinee Panda, Tom Goldstein
- **Model type:** LoRI adapter (LoRA with Reduced Interference; a PEFT method for LLMs)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** `meta-llama/Meta-Llama-3-8B`

### Model Sources

- **Repository:** [https://github.com/juzhengz/LoRI/](https://github.com/juzhengz/LoRI/)
- **Paper:** [https://arxiv.org/abs/2504.07448](https://arxiv.org/abs/2504.07448)
- **HuggingFace Collection:** [https://huggingface.co/collections/tomg-group-umd/lori-adapters-67f795549d792613e1290011](https://huggingface.co/collections/tomg-group-umd/lori-adapters-67f795549d792613e1290011)

## Uses

### Direct Use

This model is intended to be used as a PEFT adapter on top of the `meta-llama/Meta-Llama-3-8B` base model for natural language understanding tasks, leveraging its efficient design for reduced parameter overhead and improved multi-task performance.

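As a back-of-the-envelope illustration of the reduced parameter overhead, the sketch below counts trainable parameters for a single adapted projection. The layer size (a hypothetical square 4096 x 4096 projection) and the helper names are assumptions; the rank of 64 and the 90% sparsity level are the values quoted elsewhere in this card. Real totals depend on the shapes of all target modules.

```python
def lora_trainable(d_in, d_out, r):
    """Standard LoRA trains both A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


def lori_trainable(d_out, r, sparsity=0.0):
    """LoRI freezes A, so only the unmasked entries of B are trained."""
    return int(round(r * d_out * (1.0 - sparsity)))


d_in = d_out = 4096  # hypothetical square projection, for illustration only
r = 64               # adapter rank of this model

lora = lora_trainable(d_in, d_out, r)
lori_d = lori_trainable(d_out, r)                # dense B (LoRI-D)
lori_s = lori_trainable(d_out, r, sparsity=0.9)  # 90% sparse B (LoRI-S)

print(f"LoRA:   {lora} trainable parameters")
print(f"LoRI-D: {lori_d} ({1 - lori_d / lora:.0%} fewer than LoRA)")
print(f"LoRI-S: {lori_s} ({1 - lori_s / lora:.0%} fewer than LoRA)")
```

Under these assumptions, freezing `A` roughly halves the trainable parameters for a square layer, and a 90% sparse `B` brings the reduction to about 95%, which matches the "up to 95% fewer" figure quoted in the highlights.
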
### Downstream Use

LoRI adapters can be merged for multi-task applications or sequentially applied for continual learning without significant performance degradation. This makes LoRI suitable for building generalist agents or systems that need to learn new skills over time.

### Out-of-Scope Use

This model is not intended for use in high-stakes or safety-critical applications without further rigorous testing and validation. Given its focus on NLU tasks, its performance on other domains or tasks without specific fine-tuning is not guaranteed.

## Bias, Risks, and Limitations

As with any language model, this model may inherit biases present in its training data, including the base model (`Llama-3-8B`) and the datasets used for LoRI fine-tuning. Potential risks include generating biased, inaccurate, or harmful content.

### Recommendations

Users should carefully evaluate the model's output for their specific application and consider fine-tuning on domain-specific, curated data to mitigate potential biases or limitations.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,  # or torch.float16 depending on your hardware
    device_map="auto"
)

# Load the LoRI adapter
adapter = PeftModel.from_pretrained(base_model, "tomg-group-umd/LoRI-D_nlu_llama3_rank_64")

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Example usage for a general text generation task (adjust for specific NLU use-cases)
prompt = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(prompt, return_tensors="pt").to(adapter.device)

# Generate text
outputs = adapter.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

# For specific NLU tasks, the prompt and expected output format would vary.
# You would then apply relevant NLU processing to the generated text or use the adapter's output directly.
```

## Training Details

### Training Data

The LoRI models are trained on various datasets depending on the task:

- **Natural Language Understanding (NLU):** Specific NLU datasets, as indicated by this model.
- **Code generation:** CodeAlpaca dataset.
- **Mathematical reasoning:** GSM8K dataset.
- **Safety alignment:** Saferpaca dataset.

More details on specific datasets can be found in the [GitHub repository](https://github.com/juzhengz/LoRI/).

### Training Procedure

LoRI is implemented using Fully Sharded Data Parallel (FSDP) for multi-GPU training. The training involves two main stages:

1. **LoRI-D (Dense) training**: Adapters are trained with random projection matrices `A` frozen and `B` matrices dense. Sparse masks are then extracted.
2. **LoRI-S (Sparse) training**: Training continues with the extracted sparse masks applied to matrices `B`, typically at 90% sparsity.

#### Training Hyperparameters

- **Training regime:** Mixed precision (e.g., `bfloat16` for Llama-3) is typically used for training large models.
- **Adapter Rank (`r`):** 64 (for this `LoRI-D_nlu_llama3_rank_64` model).
- **LoRA Alpha (`lora_alpha`):** 128 (from `adapter_config.json`).
- **LoRA Dropout (`lora_dropout`):** 0.05 (from `adapter_config.json`).
- **Target Modules (`target_modules`):** `o_proj`, `k_proj`, `up_proj`, `q_proj`, `v_proj`, `down_proj`, `gate_proj` (from `adapter_config.json`).

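The two-stage procedure above says that sparse masks are extracted after dense training, but this card does not state the selection rule. The sketch below assumes a magnitude-based rule (keep the largest-magnitude entries of a trained `B`) applied to a single small matrix; the function name, the toy values, and the per-matrix scope are all hypothetical, so consult the repository for the actual procedure.

```python
def magnitude_mask(B, sparsity):
    """Return a 0/1 mask that keeps the largest-|value| entries of B.

    Assumed rule for illustration: keep about (1 - sparsity) of the entries.
    Ties at the threshold may keep a few extra entries.
    """
    flat = sorted((abs(v) for row in B for v in row), reverse=True)
    keep = max(1, round(len(flat) * (1.0 - sparsity)))
    threshold = flat[keep - 1]
    return [[1 if abs(v) >= threshold else 0 for v in row] for row in B]


# Toy "trained" B matrix from the dense stage (values are made up).
B = [[0.1, -2.0], [0.3, 0.05], [1.5, -0.2], [0.0, 0.4], [-0.6, 0.7]]

mask_90 = magnitude_mask(B, sparsity=0.9)  # keeps 1 of 10 entries
mask_50 = magnitude_mask(B, sparsity=0.5)  # keeps 5 of 10 entries

# In the sparse stage, only the unmasked entries of B would remain active.
B_sparse = [[v * m for v, m in zip(row, m_row)] for row, m_row in zip(B, mask_90)]
```

With a fixed mask, the sparse stage only needs to update the surviving entries, which is where the additional parameter savings over dense training come from.
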
## Evaluation

### Testing Data, Factors & Metrics

LoRI's performance has been extensively evaluated across natural language understanding, mathematical reasoning, code generation (e.g., HumanEval), and safety alignment tasks.

#### Metrics

Performance is measured using relevant metrics for each task. The paper demonstrates that LoRI consistently outperforms full fine-tuning and existing PEFT methods across various tasks, while using up to 95% fewer trainable parameters than traditional LoRA. In multi-task experiments, LoRI enables effective adapter merging and continual learning with reduced cross-task interference. For detailed quantitative results, please refer to the [paper](https://arxiv.org/abs/2504.07448).

## Technical Specifications

### Model Architecture and Objective

LoRI introduces a novel architecture where projection matrices `A` in LoRA are frozen as random projections, and matrices `B` are sparsified using task-specific masks. This design is intended to achieve monosemantic experts, reduce trainable parameters, and minimize cross-task interference. The objective remains focused on improving performance on downstream tasks while promoting parameter efficiency and modularity.

### Compute Infrastructure

#### Hardware

Training was performed in a multi-GPU environment using technologies like Fully Sharded Data Parallel (FSDP).

#### Software

The implementation uses Python, PyTorch, and the Hugging Face `transformers` and `peft` libraries.

## Acknowledgements

This project builds on the codebase of [dpo-rlaif](https://github.com/architsharma97/dpo-rlaif) and incorporates code from [lottery-ticket-adaptation](https://github.com/kiddyboots216/lottery-ticket-adaptation). Code generation performance on HumanEval is evaluated using the [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness).

## Citation

If you use LoRI in your work, please cite:

```bibtex
@article{zhang2025lori,
  title={LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation},
  author={Zhang, Juzheng and You, Jiacheng and Panda, Ashwinee and Goldstein, Tom},
  journal={arXiv preprint arXiv:2504.07448},
  year={2025}
}
```

## Model Card Contact

For questions or inquiries, please refer to the contact information provided in the original [repository](https://github.com/juzhengz/LoRI/).

### Framework versions

- PEFT 0.12.0