Prompt2OpenSCENARIO – CodeLlama-13B LoRA Fine-Tuned Model

Model Overview

This repository provides a LoRA/QLoRA fine-tuned variant of CodeLlama-13B-Instruct, specifically adapted for autonomous driving scenario generation.
The model translates natural language descriptions of traffic scenes into valid, schema-compliant OpenSCENARIO 1.0 XML files suitable for execution in CARLA and similar simulators.

  • Base model: CodeLlama-13B-Instruct
  • Fine-tuning method: Parameter-efficient fine-tuning (LoRA / QLoRA)
  • Task: Structured code generation (NL → XML)
  • Domain: Autonomous Driving Simulation
  • Frameworks: Hugging Face Transformers, PEFT, TRL, Datasets

Quick Usage (Python)

import argparse, re, torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList, BitsAndBytesConfig
from peft import PeftModel

STOP_STR = "</OpenScenario>"

class StopOnSubstrings(StoppingCriteria):
    """Stop generation as soon as any target substring appears in the decoded continuation."""

    def __init__(self, tok, substrings, start_len=0):
        self.tok = tok
        self.substrings = substrings
        self.start_len = start_len  # number of prompt tokens to skip when decoding

    def set_start_len(self, n):
        self.start_len = n

    def __call__(self, input_ids, scores, **kw):
        gen_ids = input_ids[0, self.start_len:].tolist()
        if not gen_ids:
            return False
        text = self.tok.decode(gen_ids, skip_special_tokens=True)
        return any(s in text for s in self.substrings)

def load_model_and_tokenizer(base_path: str, lora_id: str, use_4bit: bool):
    # Tokenizer: prefer the LoRA repo (to inherit chat template), fallback to base
    try:
        tok = AutoTokenizer.from_pretrained(lora_id, use_fast=True)
    except Exception:
        tok = AutoTokenizer.from_pretrained(base_path, use_fast=True)
    tok.padding_side = "right"
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token

    quant_cfg = None
    # Place weights on GPU when one is available; fall back to CPU otherwise
    device_map = "auto" if torch.cuda.is_available() else None
    torch_dtype = torch.bfloat16

    if use_4bit:
        quant_cfg = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )

    model = AutoModelForCausalLM.from_pretrained(
        base_path,
        torch_dtype=None if use_4bit else torch_dtype,
        quantization_config=quant_cfg,
        device_map=device_map,
        attn_implementation="sdpa",
    )
    model = PeftModel.from_pretrained(model, lora_id)
    model.config.use_cache = True
    model.eval()
    model.config.pad_token_id = tok.pad_token_id
    model.config.eos_token_id = tok.eos_token_id
    return model, tok

def generate_xosc(model, tok, template_path: str, system: str, user: str, max_new: int = 4000) -> str:
    # Read the prompt template and fill in the system/user messages
    with open(template_path, "r", encoding="utf-8") as f:
        tmpl = f.read()
    prompt = tmpl.format(system=system.strip(), user=user.strip())
    if prompt.endswith("[/INST]"):
        prompt += "\n"
    enc = tok(prompt, return_tensors="pt")
    enc = {k: v.to(model.device) for k, v in enc.items()}
    stopper = StopOnSubstrings(tok, [STOP_STR])
    stopper.set_start_len(enc["input_ids"].shape[1])

    with torch.inference_mode():
        out = model.generate(
            **enc,
            max_new_tokens=max_new,
            do_sample=False,  # greedy decoding keeps the output deterministic
            pad_token_id=tok.pad_token_id,
            eos_token_id=tok.eos_token_id,
            stopping_criteria=StoppingCriteriaList([stopper]),
            return_dict_in_generate=True,
        )
    gen_ids = out.sequences[0, enc["input_ids"].shape[1]:]
    txt = tok.decode(gen_ids, skip_special_tokens=True)
    # Keep only the <OpenScenario>...</OpenScenario> block; fall back to the raw text
    m = re.search(r"<OpenScenario\b.*?</OpenScenario>", txt, flags=re.DOTALL)
    return m.group(0).strip() if m else txt

if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("--base")
    ap.add_argument("--lora", default="anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA")
    ap.add_argument("--template", default=r"prompt_templates\codellama_inst.txt")
    ap.add_argument("--use_4bit", action="store_true", help="Enable 4-bit loading (Option B). If omitted, use full-precision (Option A).")
    ap.add_argument("--max_new_tokens", type=int, default=4000)
    ap.add_argument("--system", default=(
        "Act as an OpenSCENARIO 1.0 generator for ADS testing in CARLA. "
        "I will give you a scene description in English and you must return one valid .xosc file, XML only, "
        "encoded in UTF-8, starting with <OpenScenario> and ending with </OpenScenario>. The file must be schema-compliant, "
        "and executable in CARLA without modifications. The scenario must include: the map (<RoadNetwork>), "
        "<Environment> with <TimeOfDay> and <Weather>, exactly one ego vehicle, any other entities with unique names, "
        "initial positions using <WorldPosition>, and a valid <Storyboard> with deterministic triggers/events/actions. "
        "Use realistic defaults if details are missing (no randomness); no comments or extra text."
    ))
    ap.add_argument("--user", default="Write a minimal scenario with only one ego vehicle in Towns04, sunny environment.")
    args = ap.parse_args()

    model, tok = load_model_and_tokenizer(args.base, args.lora, args.use_4bit)
    xosc = generate_xosc(model, tok, args.template, args.system, args.user, args.max_new_tokens)
    print(xosc)
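
The script expects a plain-text prompt template containing {system} and {user} placeholders. The exact template ships with this repository (prompt_templates/codellama_inst.txt); as a rough sketch, a template following the standard CodeLlama-Instruct chat convention would look like this:

[INST] <<SYS>>
{system}
<</SYS>>

{user} [/INST]

Assuming the script above is saved as generate.py, a 4-bit run on a single GPU would then be: python generate.py --use_4bit --user "Your scene description here".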

Intended Use

  • Primary purpose: Automatically generate deterministic OpenSCENARIO 1.0 files from plain-text descriptions of driving scenes.
  • End users: Researchers and engineers working on ADS/ADAS testing and simulation.
  • Environment: Optimized for CARLA but compatible with any OpenSCENARIO 1.0-compliant simulator.

Example

Input (natural language): On the ARG_Carcarana-10_1_T-1 map at 09:03:31 on April 18, 2025, the scene unfolds under cloudy skies with moderate snowfall and dense fog. The ego vehicle starts at (146.915, -12.991, 0.0) facing north.

Output (OpenSCENARIO 1.0):

<OpenScenario>
  ...
  <Entities>
    <ScenarioObject name="ego_vehicle">
      <Vehicle name="vehicle.lincoln.mkz2017" vehicleCategory="car">
        ...
      </Vehicle>
    </ScenarioObject>
  </Entities>
  <Storyboard>
    ...
  </Storyboard>
</OpenScenario>
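
Because decoding can occasionally truncate or malform the XML (for instance when max_new_tokens runs out before the closing tag), it is worth sanity-checking the output before passing it to a simulator. The sketch below, reusing model, tok, and generate_xosc from the Quick Usage script above, checks well-formedness only; full validation would additionally require the OpenSCENARIO 1.0 XSD and a schema validator:

import xml.etree.ElementTree as ET

def is_well_formed(xosc_text: str) -> bool:
    # XML syntax check only -- this does not validate against the
    # OpenSCENARIO 1.0 schema.
    try:
        ET.fromstring(xosc_text)
        return True
    except ET.ParseError:
        return False

xosc = generate_xosc(model, tok, "prompt_templates/codellama_inst.txt", system, user)
if is_well_formed(xosc):
    with open("scenario.xosc", "w", encoding="utf-8") as f:  # output path is arbitrary
        f.write(xosc)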

Dataset

Information about the training data is documented in the companion dataset card: Prompt2OpenSCENARIO-LoRA-FineTune


Training Procedure

  • Method: LoRA / QLoRA (rank=32, alpha=32, dropout=0.05)
  • Target modules: Attention and MLP projections (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
  • Batching: Effective batch size = 8 (gradient accumulation with micro-batch size 1)
  • Optimizer: AdamW with cosine scheduler
  • Learning rate: 2e-4
  • Epochs: 2
  • Max sequence length: 4000 tokens
  • Precision: bfloat16 (mixed precision)
  • Frameworks: Hugging Face Transformers + TRL SFTTrainer + PEFT (a configuration sketch follows below)
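
The training script itself is not part of this repository, but the hyperparameters above correspond roughly to the PEFT/TRL configuration sketched below. Treat it as orientation rather than the exact recipe; in particular, the max_seq_length argument has been renamed across recent TRL releases:

from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter configuration matching the hyperparameters listed above
peft_cfg = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Trainer arguments: micro-batch 1 with 8 accumulation steps = effective batch 8
train_cfg = SFTConfig(
    output_dir="prompt2openscenario-lora",  # hypothetical output directory
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
    max_seq_length=4000,
    bf16=True,
    optim="adamw_torch",
)

# Training would then run via trl.SFTTrainer(model=..., args=train_cfg,
# train_dataset=..., peft_config=peft_cfg).train()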

Citation

If you use this model in your research, please cite:

@misc{prompt2openscenario2025,
  title        = {Prompt2OpenSCENARIO: A CodeLlama-13B LoRA Fine-Tuned Model for OpenSCENARIO Generation},
  author       = {anto0699},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA}}
}

Licence

This model inherits the licence of its base model (Code Llama) and is intended for research and educational purposes only.
