Evolution Learning Network (ELN) with QLoRA and Genetic Algorithms for LLMs

Overview

This project implements an Evolution Learning Network (ELN) to fine-tune transformer-based models like LLaMA using a combination of Quantized Low-Rank Adaptation (QLoRA) and Genetic Algorithms (GA). The primary objective is to evolve a population of models across multiple generations to optimize for performance (fitness) and specialization, while maintaining diversity.
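
The loop below is a minimal sketch of how such a network can be driven; the helpers fine_tune, evaluate_fitness, and mutate are hypothetical stand-ins for the QLoRA training, evaluation, and mutation components described in the rest of this document.

import random

# Hypothetical stubs standing in for QLoRA fine-tuning, fitness evaluation
# on held-out data, and LoRA-weight mutation (see the sections below).
def fine_tune(individual):
    return individual

def evaluate_fitness(individual):
    return random.random()

def mutate(individual):
    return dict(individual)

def evolve(population, generations=10, survivors=2):
    """High-level ELN loop: fine-tune, score, select, and mutate each generation."""
    for gen in range(generations):
        scores = [evaluate_fitness(fine_tune(ind)) for ind in population]
        ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
        parents = [ind for _, ind in ranked[:survivors]]                    # fitness-based selection
        population = [mutate(random.choice(parents)) for _ in population]   # refill via mutation
        print(f"generation {gen + 1}: best fitness {ranked[0][0]:.4f}")
    return population

evolve([{"id": i} for i in range(4)])  # population size 4, as in the hyperparameters below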

Key Features

  • Efficient model fine-tuning using QLoRA.
  • Evolutionary strategies, including random mutations and fitness-based selection.
  • Hardware-efficient training with 4-bit quantization.
  • Comprehensive experiment tracking with WandB.
  • Diversity maintenance through LoRA weight fingerprinting.
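
How the diversity metric is computed is not spelled out above; the sketch below assumes each model is fingerprinted by flattening its LoRA adapter weights and that diversity is the mean pairwise cosine distance between fingerprints.

import itertools
import torch

def lora_fingerprint(model):
    # Concatenate every LoRA adapter tensor into a single flat vector
    parts = [p.detach().float().flatten()
             for name, p in model.named_parameters() if "lora_" in name]
    return torch.cat(parts)

def population_diversity(models):
    # Mean pairwise (1 - cosine similarity) between LoRA fingerprints
    prints = [lora_fingerprint(m) for m in models]
    dists = [1 - torch.nn.functional.cosine_similarity(a, b, dim=0)
             for a, b in itertools.combinations(prints, 2)]
    return torch.stack(dists).mean().item()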

Model Details

Base Model

  • Name: meta-llama/Llama-3.2-1B (can be replaced with any Hugging Face model).
  • Architecture: Transformer-based causal language model.

Quantization Configuration

  • Quantization Type: 4-bit using bitsandbytes (bnb_4bit).
  • Parameters:
    • Compute Type: torch.float16
    • Quantization Data Type: "nf4" (4-bit NormalFloat).
    • Double (nested) Quantization: Enabled.
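
These settings correspond to a bitsandbytes configuration along the following lines (a sketch; the exact flags used in the training script may differ slightly):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weight quantization
    bnb_4bit_quant_type="nf4",             # NormalFloat-4 data type
    bnb_4bit_compute_dtype=torch.float16,  # fp16 compute
    bnb_4bit_use_double_quant=True,        # double (nested) quantization of the quantization constants
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",
    quantization_config=bnb_config,
    device_map="auto",
)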

LoRA (Low-Rank Adaptation)

  • Dimensions (r): 8
  • Alpha (Scaling): 16
  • Target Modules: Query and Value projections (q_proj, v_proj).
  • Dropout: 0.05
  • Task Type: Causal Language Modeling (CAUSAL_LM).
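
Continuing from the quantized base_model sketched above, these LoRA settings map onto a PEFT configuration roughly as follows (bias handling is an assumption, left at the PEFT default):

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention query/value projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only the LoRA matrices are trainable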

Training Strategy

  • Optimizer: paged_adamw_8bit for memory-efficient updates.
  • Precision: Mixed precision (fp16) for faster training.

Hyperparameters

General Parameters

  • Generations: 10
  • Population Size: 4
  • Dataset Size: 2000 samples per split (adjustable for larger datasets).

Training

  • Batch Size: 8
  • Gradient Accumulation: 16 steps.
  • Learning Rate: 2e-4
  • Epochs per Model: 2
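
Taken together with the training strategy above, these values translate into a Hugging Face TrainingArguments roughly like the following (the output directory and logging settings are assumptions):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="eln-individual",        # assumed output path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,     # effective batch size of 128 per device
    learning_rate=2e-4,
    num_train_epochs=2,
    optim="paged_adamw_8bit",           # memory-efficient paged 8-bit AdamW
    fp16=True,                          # mixed-precision training
    logging_steps=10,
    report_to="wandb",                  # experiment tracking
)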

Mutations

  • Mutation Rate: 10% (probability per parameter).
  • Mutation Scale: Noise added with a standard deviation of 0.02.
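
A minimal sketch of the mutation step, assuming the 10% rate applies per weight element and that only the LoRA adapter tensors are perturbed:

import torch

def mutate_lora_weights(model, mutation_rate=0.10, mutation_scale=0.02):
    # Perturb each LoRA weight element with probability `mutation_rate`
    # by adding Gaussian noise with standard deviation `mutation_scale`.
    with torch.no_grad():
        for name, param in model.named_parameters():
            if "lora_" not in name:
                continue
            mask = torch.rand_like(param) < mutation_rate     # ~10% of elements selected
            noise = torch.randn_like(param) * mutation_scale  # zero-mean noise, std 0.02
            param.add_(mask * noise)
    return model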

Dataset Details

Source

  • Name: WikiText (wikitext-2-raw-v1; a larger WikiText configuration can be swapped in for bigger datasets).
  • Splits:
    • train → Model training.
    • validation → General task evaluation.
    • test → Specific task evaluation.
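
The splits and the 2,000-sample subsets can be prepared with the datasets library, for example:

from datasets import load_dataset

dataset = load_dataset("wikitext", "wikitext-2-raw-v1")
train_data = dataset["train"].select(range(2000))         # model training
general_eval = dataset["validation"].select(range(2000))  # general task evaluation
specific_eval = dataset["test"].select(range(2000))       # specific task evaluation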

Tokenization

  • Tokenizer: Hugging Face AutoTokenizer.
  • Max Token Length: 128 tokens.
  • Padding: Fixed to "max_length".
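
Continuing from the dataset sketch above, tokenization looks roughly like this (setting the pad token to EOS is an assumption; LLaMA tokenizers ship without one):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer.pad_token = tokenizer.eos_token  # assumed: reuse EOS as the pad token

def tokenize(batch):
    return tokenizer(
        batch["text"],
        max_length=128,        # max token length
        padding="max_length",  # fixed-length padding
        truncation=True,
    )

tokenized_train = train_data.map(tokenize, batched=True, remove_columns=["text"])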

Results

Summary

  • Total Generations: 10
  • Best Fitness Achieved: 0.4772
  • Final Population Diversity: 0.0011

Evolution History (Highlights)

Generation | Best Fitness | Avg Fitness | Diversity | Best Specialization
1          | 0.4096       | 0.4023      | 0.00097   | 0.9967
5          | 0.4727       | 0.4722      | 0.00099   | 0.9968
10         | 0.4772       | 0.4768      | 0.00106   | 0.9972

Hardware & Framework

Hardware

  • Multi-GPU support with torch.nn.parallel.DistributedDataParallel or Hugging Face Accelerate's Accelerator.
  • Logs GPU/CPU usage with psutil and torch.cuda.
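
A sketch of the kind of usage logging described above (which metrics are actually sent to WandB is an assumption):

import psutil
import torch

def log_system_usage():
    # Basic CPU/RAM statistics plus GPU memory, when a GPU is present
    usage = {
        "cpu_percent": psutil.cpu_percent(),
        "ram_percent": psutil.virtual_memory().percent,
    }
    if torch.cuda.is_available():
        usage["gpu_mem_allocated_gb"] = torch.cuda.memory_allocated() / 1024**3
        usage["gpu_mem_reserved_gb"] = torch.cuda.memory_reserved() / 1024**3
    return usage

print(log_system_usage())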

Frameworks & Libraries

  • Transformers: Hugging Face model and tokenizer handling.
  • Datasets: Data loading and processing.
  • WandB: Experiment tracking and visualization.
  • BitsAndBytes: 4-bit quantization.
  • PEFT: LoRA-based fine-tuning.

Future Work

  • Explore larger population sizes and more generations for enhanced diversity.
  • Experiment with other datasets to generalize findings.
  • Integrate additional mutation strategies for broader exploration.

Citation

Citation information to be added.


Code to run locally

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the full-precision base model, then attach the evolved LoRA adapter from the Hub
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base_model, "diabolic6045/ELN-AOC-CAIN")
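
Once the adapter is attached, text can be generated in the usual way (the prompt and generation settings below are illustrative):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
inputs = tokenizer("The theory of evolution states that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))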

Framework versions

  • PEFT 0.14.0

~ diabolic6045
