Prompt2OpenSCENARIO – CodeLlama-13B LoRA Fine-Tuned Model

Model Overview

This repository provides a LoRA/QLoRA fine-tuned variant of CodeLlama-13B-Instruct, specifically adapted for autonomous driving scenario generation.
The model translates natural language descriptions of traffic scenes into valid, schema-compliant OpenSCENARIO 1.0 XML files suitable for execution in CARLA and similar simulators.

  • Base model: CodeLlama-13B-Instruct
  • Fine-tuning method: Parameter-efficient fine-tuning (LoRA / QLoRA)
  • Task: Structured code generation (NL → XML)
  • Domain: Autonomous Driving Simulation
  • Frameworks: Hugging Face Transformers, PEFT, TRL, Datasets

Quick Usage (Python)

import argparse, re, torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList, BitsAndBytesConfig
from peft import PeftModel

STOP_STR = "</OpenScenario>"

class StopOnSubstrings(StoppingCriteria):
    """Stop generation as soon as any target substring appears in the decoded continuation."""

    def __init__(self, tok, substrings, start_len=0):
        self.tok = tok
        self.substrings = substrings
        self.start_len = start_len  # number of prompt tokens to skip when decoding

    def set_start_len(self, n):
        self.start_len = n

    def __call__(self, input_ids, scores, **kw):
        gen_ids = input_ids[0, self.start_len:].tolist()
        if not gen_ids:
            return False
        text = self.tok.decode(gen_ids, skip_special_tokens=True)
        return any(s in text for s in self.substrings)

def load_model_and_tokenizer(base_path: str, lora_id: str, use_4bit: bool):
    # Tokenizer: prefer the LoRA repo (to inherit chat template), fallback to base
    try:
        tok = AutoTokenizer.from_pretrained(lora_id, use_fast=True)
    except Exception:
        tok = AutoTokenizer.from_pretrained(base_path, use_fast=True)
    tok.padding_side = "right"
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token

    quant_cfg = None
    # Place weights on GPU when one is available; fall back to CPU otherwise
    device_map = "auto" if torch.cuda.is_available() else None
    torch_dtype = torch.bfloat16

    if use_4bit:
        quant_cfg = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )

    model = AutoModelForCausalLM.from_pretrained(
        base_path,
        torch_dtype=None if use_4bit else torch_dtype,
        quantization_config=quant_cfg,
        device_map=device_map,
        attn_implementation="sdpa",
    )
    model = PeftModel.from_pretrained(model, lora_id)
    model.config.use_cache = True
    model.eval()
    model.config.pad_token_id = tok.pad_token_id
    model.config.eos_token_id = tok.eos_token_id
    return model, tok

def generate_xosc(model, tok, template_path: str, system: str, user: str, max_new: int = 4000) -> str:
    # Read the prompt template and fill in the system/user messages
    with open(template_path, "r", encoding="utf-8") as f:
        tmpl = f.read()
    prompt = tmpl.format(system=system.strip(), user=user.strip())
    if prompt.endswith("[/INST]"):
        prompt += "\n"
    enc = tok(prompt, return_tensors="pt")
    enc = {k: v.to(model.device) for k, v in enc.items()}
    stopper = StopOnSubstrings(tok, [STOP_STR])
    stopper.set_start_len(enc["input_ids"].shape[1])

    with torch.inference_mode():
        out = model.generate(
            **enc,
            max_new_tokens=max_new,
            do_sample=False,  # greedy decoding keeps the output deterministic
            pad_token_id=tok.pad_token_id,
            eos_token_id=tok.eos_token_id,
            stopping_criteria=StoppingCriteriaList([stopper]),
            return_dict_in_generate=True,
        )
    gen_ids = out.sequences[0, enc["input_ids"].shape[1]:]
    txt = tok.decode(gen_ids, skip_special_tokens=True)
    # Keep only the <OpenScenario>...</OpenScenario> block; fall back to the raw text
    m = re.search(r"<OpenScenario\b.*?</OpenScenario>", txt, flags=re.DOTALL)
    return m.group(0).strip() if m else txt

if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("--base")
    ap.add_argument("--lora", default="anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA")
    ap.add_argument("--template", default=r"prompt_templates\codellama_inst.txt")
    ap.add_argument("--use_4bit", action="store_true", help="Enable 4-bit loading (Option B). If omitted, use full-precision (Option A).")
    ap.add_argument("--max_new_tokens", type=int, default=4000)
    ap.add_argument("--system", default=(
        "Act as an OpenSCENARIO 1.0 generator for ADS testing in CARLA. "
        "I will give you a scene description in English and you must return one valid .xosc file, XML only, "
        "encoded in UTF-8, starting with <OpenScenario> and ending with </OpenScenario>. The file must be schema-compliant, "
        "and executable in CARLA without modifications. The scenario must include: the map (<RoadNetwork>), "
        "<Environment> with <TimeOfDay> and <Weather>, exactly one ego vehicle, any other entities with unique names, "
        "initial positions using <WorldPosition>, and a valid <Storyboard> with deterministic triggers/events/actions. "
        "Use realistic defaults if details are missing (no randomness); no comments or extra text."
    ))
    ap.add_argument("--user", default="Write a minimal scenario with only one ego vehicle in Towns04, sunny environment.")
    args = ap.parse_args()

    model, tok = load_model_and_tokenizer(args.base, args.lora, args.use_4bit)
    xosc = generate_xosc(model, tok, args.template, args.system, args.user, args.max_new_tokens)
    print(xosc)
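
The script expects a plain-text prompt template containing {system} and {user} placeholders. The exact template ships with this repository (prompt_templates/codellama_inst.txt); as a rough sketch, a template following the standard CodeLlama-Instruct chat convention would look like this:

[INST] <<SYS>>
{system}
<</SYS>>

{user} [/INST]

Assuming the script above is saved as generate.py, a 4-bit run on a single GPU would then be: python generate.py --use_4bit --user "Your scene description here".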

Intended Use

  • Primary purpose: Automatically generate deterministic OpenSCENARIO 1.0 files from plain-text descriptions of driving scenes.
  • End users: Researchers and engineers working on ADS/ADAS testing and simulation.
  • Environment: Optimized for CARLA but compatible with any OpenSCENARIO 1.0-compliant simulator.

Example

Input (natural language): On the ARG_Carcarana-10_1_T-1 map at 09:03:31 on April 18, 2025, the scene unfolds under cloudy skies with moderate snowfall and dense fog. The ego vehicle starts at (146.915, -12.991, 0.0) facing north.

Output (OpenSCENARIO 1.0):

<OpenScenario>
  ...
  <Entities>
    <ScenarioObject name="ego_vehicle">
      <Vehicle name="vehicle.lincoln.mkz2017" vehicleCategory="car">
        ...
      </Vehicle>
    </ScenarioObject>
  </Entities>
  <Storyboard>
    ...
  </Storyboard>
</OpenScenario>
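
Because decoding can occasionally truncate or malform the XML (for instance when max_new_tokens runs out before the closing tag), it is worth sanity-checking the output before passing it to a simulator. The sketch below, reusing model, tok, and generate_xosc from the Quick Usage script above, checks well-formedness only; full validation would additionally require the OpenSCENARIO 1.0 XSD and a schema validator:

import xml.etree.ElementTree as ET

def is_well_formed(xosc_text: str) -> bool:
    # XML syntax check only -- this does not validate against the
    # OpenSCENARIO 1.0 schema.
    try:
        ET.fromstring(xosc_text)
        return True
    except ET.ParseError:
        return False

xosc = generate_xosc(model, tok, "prompt_templates/codellama_inst.txt", system, user)
if is_well_formed(xosc):
    with open("scenario.xosc", "w", encoding="utf-8") as f:  # output path is arbitrary
        f.write(xosc)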

Dataset

Information about the training data is documented in the companion dataset card: Prompt2OpenSCENARIO-LoRA-FineTune


Training Procedure

  • Method: LoRA / QLoRA (rank=32, alpha=32, dropout=0.05)
  • Target modules: Attention and MLP projections (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
  • Batching: Effective batch size = 8 (gradient accumulation with micro-batch size 1)
  • Optimizer: AdamW with cosine scheduler
  • Learning rate: 2e-4
  • Epochs: 2
  • Max sequence length: 4000 tokens
  • Precision: bfloat16 (mixed precision)
  • Frameworks: Hugging Face Transformers + TRL SFTTrainer + PEFT (a configuration sketch follows below)
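
The training script itself is not part of this repository, but the hyperparameters above correspond roughly to the PEFT/TRL configuration sketched below. Treat it as orientation rather than the exact recipe; in particular, the max_seq_length argument has been renamed across recent TRL releases:

from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter configuration matching the hyperparameters listed above
peft_cfg = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Trainer arguments: micro-batch 1 with 8 accumulation steps = effective batch 8
train_cfg = SFTConfig(
    output_dir="prompt2openscenario-lora",  # hypothetical output directory
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
    max_seq_length=4000,
    bf16=True,
    optim="adamw_torch",
)

# Training would then run via trl.SFTTrainer(model=..., args=train_cfg,
# train_dataset=..., peft_config=peft_cfg).train()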

Citation

If you use this model in your research, please cite:

@misc{prompt2openscenario2025,
  title        = {Prompt2OpenSCENARIO: A CodeLlama-13B LoRA Fine-Tuned Model for OpenSCENARIO Generation},
  author       = {anto0699},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA}}
}

Licence

This model inherits the licence of its base model (Code Llama) and is intended for research and educational purposes only.
