---
library_name: peft
model_name: Prompt2OpenSCENARIO-CodeLlama13B-LoRA
tags:
- lora
- sft
- transformers
- trl
license: other
pipeline_tag: text-generation
base_model: codellama/CodeLlama-13b-Instruct-hf
---

# Prompt2OpenSCENARIO – CodeLlama-13B LoRA Fine-Tuned Model

## Model Overview

This repository provides a **LoRA/QLoRA fine-tuned variant of CodeLlama-13B-Instruct**, adapted specifically for **autonomous driving scenario generation**. The model translates **natural language descriptions** of traffic scenes into **valid, schema-compliant OpenSCENARIO 1.0 XML files** suitable for execution in CARLA and similar simulators.

- **Base model:** [CodeLlama-13B-Instruct](https://huggingface.co/codellama/CodeLlama-13b-Instruct-hf)
- **Fine-tuning method:** Parameter-efficient fine-tuning (LoRA / QLoRA)
- **Task:** Structured code generation (NL → XML)
- **Domain:** Autonomous driving simulation
- **Frameworks:** Hugging Face Transformers, PEFT, TRL, Datasets

---

## Quick Usage (Python)

The script below loads the LoRA adapter on top of the base model and generates one `.xosc` file from a natural-language description. Pass `--use_4bit` to load the base model in 4-bit (Option B); omit it for full precision (Option A).

```python
import argparse, re, torch
from transformers import (
    AutoModelForCausalLM, AutoTokenizer,
    StoppingCriteria, StoppingCriteriaList, BitsAndBytesConfig,
)
from peft import PeftModel

# Generation stops once the closing OpenSCENARIO tag has been emitted.
STOP_STR = "</OpenSCENARIO>"


class StopOnSubstrings(StoppingCriteria):
    def __init__(self, tok, substrings, start_len=0):
        self.tok = tok
        self.substrings = substrings
        self.start_len = start_len

    def set_start_len(self, n):
        self.start_len = n

    def __call__(self, input_ids, scores, **kw):
        # Only inspect tokens generated after the prompt.
        gen_ids = input_ids[0, self.start_len:].tolist()
        if not gen_ids:
            return False
        text = self.tok.decode(gen_ids, skip_special_tokens=True)
        return any(s in text for s in self.substrings)


def load_model_and_tokenizer(base_path: str, lora_id: str, use_4bit: bool):
    # Tokenizer: prefer the LoRA repo (to inherit the chat template), fall back to base
    try:
        tok = AutoTokenizer.from_pretrained(lora_id, use_fast=True)
    except Exception:
        tok = AutoTokenizer.from_pretrained(base_path, use_fast=True)
    tok.padding_side = "right"
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token

    quant_cfg = None
    device_map = None
    torch_dtype = torch.bfloat16
    if use_4bit:
        quant_cfg = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
        device_map = "auto"

    model = AutoModelForCausalLM.from_pretrained(
        base_path,
        torch_dtype=None if use_4bit else torch_dtype,
        quantization_config=quant_cfg,
        device_map=device_map,
        attn_implementation="sdpa",
    )
    model = PeftModel.from_pretrained(model, lora_id)
    model.config.use_cache = True
    model.eval()
    model.config.pad_token_id = tok.pad_token_id
    model.config.eos_token_id = tok.eos_token_id
    return model, tok


def generate_xosc(model, tok, template_path: str, system: str, user: str, max_new: int = 4000) -> str:
    tmpl = open(template_path, "r", encoding="utf-8").read()
    prompt = tmpl.format(system=system.strip(), user=user.strip())
    if prompt.endswith("[/INST]"):
        prompt += "\n"

    enc = tok(prompt, return_tensors="pt")
    enc = {k: v.to(model.device) for k, v in enc.items()}

    stopper = StopOnSubstrings(tok, [STOP_STR])
    stopper.set_start_len(enc["input_ids"].shape[1])

    with torch.inference_mode():
        out = model.generate(
            **enc,
            max_new_tokens=max_new,
            do_sample=False,  # greedy, deterministic decoding
            temperature=0.0,
            top_p=1.0,
            pad_token_id=tok.pad_token_id,
            eos_token_id=tok.eos_token_id,
            stopping_criteria=StoppingCriteriaList([stopper]),
            return_dict_in_generate=True,
        )

    gen_ids = out.sequences[0, enc["input_ids"].shape[1]:]
    txt = tok.decode(gen_ids, skip_special_tokens=True)
    # Extract the complete <OpenSCENARIO>...</OpenSCENARIO> block if present.
    m = re.search(r"<OpenSCENARIO.*?</OpenSCENARIO>", txt, flags=re.DOTALL)
    return m.group(0).strip() if m else txt


if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("--base", default="codellama/CodeLlama-13b-Instruct-hf")
    ap.add_argument("--lora", default="anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA")
    ap.add_argument("--template", default=r"prompt_templates\codellama_inst.txt")
    ap.add_argument("--use_4bit", action="store_true",
                    help="Enable 4-bit loading (Option B). If omitted, use full precision (Option A).")
    ap.add_argument("--max_new_tokens", type=int, default=4000)
    ap.add_argument("--system", default=(
        "Act as an OpenSCENARIO 1.0 generator for ADS testing in CARLA. "
        "I will give you a scene description in English and you must return one valid .xosc file, XML only, "
        "encoded in UTF-8, starting with <?xml version=\"1.0\" encoding=\"UTF-8\"?> and ending with </OpenSCENARIO>. "
        "The file must be schema-compliant and executable in CARLA without modifications. "
        "The scenario must include: the map (<RoadNetwork>), <Environment> with <TimeOfDay> and <Weather>, "
        "exactly one ego vehicle, any other entities with unique names, "
        "initial positions using <WorldPosition>, and a valid <Storyboard> with deterministic triggers/events/actions. "
        "Use realistic defaults if details are missing (no randomness); no comments or extra text."
    ))
    ap.add_argument("--user", default="Write a minimal scenario with only one ego vehicle in Town04, sunny environment.")
    args = ap.parse_args()

    model, tok = load_model_and_tokenizer(args.base, args.lora, args.use_4bit)
    xosc = generate_xosc(model, tok, args.template, args.system, args.user, args.max_new_tokens)
    print(xosc)
```

## Intended Use

- **Primary purpose:** automatically generate deterministic OpenSCENARIO 1.0 files from plain-text descriptions of driving scenes.
- **End users:** researchers and engineers working on ADS/ADAS testing and simulation.
- **Environment:** optimized for **CARLA**, but compatible with any OpenSCENARIO 1.0-compliant simulator.

### Example

**Input (natural language):**

On the ARG_Carcarana-10_1_T-1 map at 09:03:31 on April 18, 2025, the scene unfolds under cloudy skies with moderate snowfall and dense fog. The ego vehicle starts at (146.915, -12.991, 0.0) facing north.

**Output (OpenSCENARIO 1.0, abridged):**

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OpenSCENARIO>
  <RoadNetwork> ... </RoadNetwork>
  <Entities> ... </Entities>
  <Storyboard> ... </Storyboard>
</OpenSCENARIO>
```

## Dataset

Read the dataset details here: [Prompt2OpenSCENARIO-LoRA-FineTune](https://huggingface.co/codellama/CodeLlama-13b-Instruct-hf)

---

## Training Procedure

- **Method:** LoRA / QLoRA (rank = 32, alpha = 32, dropout = 0.05)
- **Target modules:** attention and MLP projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`)
- **Batching:** effective batch size = 8 (gradient accumulation with micro-batch size 1)
- **Optimizer:** AdamW with cosine scheduler
- **Learning rate:** 2e-4
- **Epochs:** 2
- **Max sequence length:** 4000 tokens
- **Precision:** bfloat16 (mixed precision)
- **Frameworks:** Hugging Face Transformers + TRL `SFTTrainer` + PEFT (see the configuration sketch below)
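As a rough guide to reproducing this setup, the sketch below maps the hyperparameters above onto PEFT's `LoraConfig` and TRL's `SFTTrainer`. It is a minimal sketch, not the actual training script: the dataset file, text formatting, and output directory are placeholders, and some argument names differ slightly across TRL versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: one JSON record per prompt/XML pair (see the Dataset section).
dataset = load_dataset("json", data_files="train.jsonl", split="train")

peft_config = LoraConfig(
    r=32,                      # LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="prompt2openscenario-lora",  # placeholder
    per_device_train_batch_size=1,          # micro-batch size 1
    gradient_accumulation_steps=8,          # -> effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
    max_seq_length=4000,
    bf16=True,
)

trainer = SFTTrainer(
    model="codellama/CodeLlama-13b-Instruct-hf",
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```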
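For deployment, the adapter can also be folded into the base weights so that inference needs only `transformers`. Below is a minimal sketch using PEFT's `merge_and_unload`; the output directory is a placeholder, and the base model must be loaded un-quantized for merging.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the full-precision base model (merging does not work on 4-bit weights).
base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-13b-Instruct-hf", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA")

# Fold the LoRA deltas into the base weights and save a standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("prompt2openscenario-merged")  # placeholder output dir

tok = AutoTokenizer.from_pretrained("anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA")
tok.save_pretrained("prompt2openscenario-merged")
```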
---

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{prompt2openscenario2025,
  title        = {Prompt2OpenSCENARIO-CodeLlama13B-LoRA},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/anto0699/Prompt2OpenSCENARIO-CodeLlama13B-LoRA}}
}
```

## Licence

This model inherits the license of the base model (CodeLlama, released under the Llama 2 Community License) and is intended for research and educational purposes only.