Tags: Text Generation · PEFT · Safetensors · lora · code-generation · llama

Improve model card for LoRI-S_code_llama3_rank_64

#1 by nielsr (HF Staff), opened

This PR enhances the model card for `tomg-group-umd/LoRI-S_code_llama3_rank_64` with comprehensive information, improving its discoverability and usability.

Key updates include:

  • Metadata Enrichment: Adding `license: apache-2.0` and relevant tags such as `peft`, `lora`, `code-generation`, and `llama`, along with the specific datasets used to train this model.
  • Detailed Model Description: Populating the "Model Details" section with information about the developers, model type, language, and the base model, based on the paper abstract and GitHub repository.
  • Complete Model Sources: Adding direct links to the official GitHub repository, the Hugging Face paper page, the project page, and the Hugging Face collection.
  • Elaborated Usage Instructions: Filling in "Uses" sections (Direct Use, Downstream Use, Out-of-Scope) to clarify the model's intended applications and limitations.
  • Executable Code Snippet: Providing a runnable Python code example in "How to Get Started" for quick inference using transformers and peft.
  • Training Information: Detailing the "Training Data" and "Training Procedure" (LoRI-D and LoRI-S stages, FSDP) and "Training Hyperparameters" (rank, sparsity, etc.).
  • Evaluation Summary: Summarizing key evaluation aspects and directing users to the paper for detailed results.
  • Citation: Including the BibTeX entry from the paper.
  • Visual Aid: Embedding the LoRI architecture diagram from the GitHub repository.

This update makes the model card much more informative and user-friendly for researchers and practitioners.

juzhengz changed pull request status to merged
