Update README.md
README.md (changed)
```diff
@@ -85,28 +85,7 @@ The fine-tuning dataset was compiled from the following sources:
 * **Base Model:** `unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit` loaded with 4-bit quantization (`load_in_4bit=True`).
 * **Fine-tuning Method:** Supervised Fine-Tuning (SFT) using `trl.SFTTrainer`.
 * **Parameter Efficiency:** PEFT with LoRA (`get_peft_model`).
-    * `r`: 256
-    * `lora_alpha`: 256
-    * `lora_dropout`: 0.0
-    * `target_modules`: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
 * **Training Configuration (`SFTConfig`):**
-    * `max_seq_length`: 128000
-    * `packing`: False
-    * `per_device_train_batch_size`: 4
-    * `gradient_accumulation_steps`: 8 (Effective Batch Size: 32)
-    * `warmup_ratio`: 0.02
-    * `num_train_epochs`: 1
-    * `learning_rate`: 5e-5
-    * `fp16`: True
-    * `bf16`: True (Mixed Precision Training)
-    * `logging_steps`: 10
-    * `optim`: "adamw_8bit"
-    * `weight_decay`: 0.01
-    * `lr_scheduler_type`: "cosine_with_restarts"
-    * `seed`: 1729
-    * `output_dir`: "lora_outputs_run5"
-    * `save_strategy`: "steps"
-    * `save_steps`: 1000
 * **Optimization Kernel:** Liger kernel enabled (`use_liger=True`) for increased throughput and reduced memory usage via optimized Triton kernels for common LLM operations.
 
 ## Inference - vLLM
```
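The removed bullets above fully specify the LoRA and trainer hyperparameters. As a hedged sketch only, assuming Unsloth's `FastLanguageModel` API alongside `trl`, the setup might look like the following. Note that the removed bullets list both `fp16: True` and `bf16: True`, which Transformers rejects as contradictory; the sketch keeps only `bf16`. The training dataset is not specified here, so it stays a placeholder.

```python
# Sketch of the fine-tuning setup described above. Assumes unsloth, trl, and
# peft are installed and a CUDA GPU is available. Not the author's exact script.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Base model in 4-bit, as listed in the removed bullets.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length=128_000,
    load_in_4bit=True,
)

# LoRA adapters: r = lora_alpha = 256 over all attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=256,
    lora_alpha=256,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Trainer configuration mirroring the removed SFTConfig bullets.
config = SFTConfig(
    max_seq_length=128_000,
    packing=False,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # effective batch size: 4 * 8 = 32
    warmup_ratio=0.02,
    num_train_epochs=1,
    learning_rate=5e-5,
    bf16=True,                       # mixed-precision training (fp16 dropped; see note above)
    logging_steps=10,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="cosine_with_restarts",
    seed=1729,
    output_dir="lora_outputs_run5",
    save_strategy="steps",
    save_steps=1000,
    use_liger=True,                  # Liger Triton kernels, per the README
)

train_dataset = ...  # placeholder: the compiled fine-tuning dataset (not specified here)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    args=config,
)
trainer.train()
```

Keeping `packing=False` with a 128k `max_seq_length` means short examples are padded rather than concatenated, which trades throughput for exact per-example boundaries.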
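The vLLM inference section begins at the heading above. As a hedged sketch only (the base-model choice, adapter path, and prompt below are illustrative assumptions, not taken from this README), serving the rank-256 adapter with vLLM's LoRA support might look like:

```python
# Sketch of vLLM inference with the trained LoRA adapter. Assumes vllm is
# installed and a CUDA GPU is available; model and adapter paths are placeholders.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# vLLM serves the full-precision base model; the bnb-4bit training checkpoint
# is not loaded directly here (an assumption, since the README section is elided).
llm = LLM(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    enable_lora=True,
    max_lora_rank=256,  # must cover the r=256 adapter trained above
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(
    ["Example prompt"],  # placeholder prompt
    params,
    lora_request=LoRARequest("sft_adapter", 1, "lora_outputs_run5"),
)
print(outputs[0].outputs[0].text)
```

`LoRARequest` takes a name, an integer ID, and the adapter directory; the directory here reuses the `output_dir` from the training configuration, which is an assumption about where the final adapter was saved.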