Pretrained LM

  • beomi/Llama-3-Open-Ko-8B

Training Dataset

Prompt

  • Template:
      prompt = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}: "
    
      >>> # src_lang can be 'English', '한국어'
      >>> # tgt_lang can be '한국어', 'English'
    
    Mind that there is a trailing space (_) at the end of the prompt; without it, an unpredictable first token may be generated. If you use vLLM, however, it is fine to drop the final space (_).
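
    For reference, here is a minimal sketch of building the prompt in both directions; the build_prompt helper below is illustrative and not part of this repo:

      # Illustrative helper (not in the repo): builds the expected prompt string
      def build_prompt(src_lang, tgt_lang, src_text):
          # keep the trailing space after the final colon (see the note above)
          return f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}: "

      en2ko_prompt = build_prompt('English', '한국어', 'Hello!')
      ko2en_prompt = build_prompt('한국어', 'English', '안녕하세요!')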

Training

  • Trained with QLoRA
    • PLM: NormalFloat 4-bit
    • Adapter: BrainFloat 16-bit
    • LoRA adapters applied to all linear layers (about 2.05% of the total parameters)
  • Adapters merged and the model upscaled to BrainFloat 16-bit precision (see the sketch below)
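
  A minimal sketch of a matching QLoRA setup with peft and bitsandbytes. Only the NF4 quantization and the "all linear layers" targeting follow the card; the LoRA rank, alpha, and dropout below are illustrative assumptions, since the card does not state them:

      import torch
      from transformers import BitsAndBytesConfig
      from peft import LoraConfig

      # NF4 quantization for the frozen PLM weights
      bnb_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type='nf4',
          bnb_4bit_compute_dtype=torch.bfloat16,
          bnb_4bit_use_double_quant=True,
      )

      # BF16 LoRA adapters on every linear projection of the Llama architecture
      lora_config = LoraConfig(
          r=16,               # assumption: actual rank not stated in the card
          lora_alpha=32,      # assumption
          lora_dropout=0.05,  # assumption
          target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj'],
          bias='none',
          task_type='CAUSAL_LM',
      )

  After training, the adapters can be merged into a BF16 copy of the base model with PeftModel.from_pretrained(bf16_base_model, adapter_name).merge_and_unload().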

Usage (IMPORTANT)

  • Remove the EOS token that the tokenizer appends to the end of the prompt (handled by the check in the snippet below).
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
      from peft import PeftModel

      # MODEL
      model_name = 'beomi/Llama-3-Open-Ko-8B'
      adapter_name = 'traintogpb/llama-3-enko-translator-8b-qlora-adapter'
      bnb_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type='nf4',
          bnb_4bit_compute_dtype=torch.bfloat16,
          bnb_4bit_use_double_quant=True
      )
      model = AutoModelForCausalLM.from_pretrained(
          model_name,
          max_length=768,
          quantization_config=bnb_config,
          attn_implementation='flash_attention_2',
          torch_dtype=torch.bfloat16,
          device_map='auto',
      )
      model = PeftModel.from_pretrained(
          model,
          adapter_name,  # adapter repo id goes in as the second positional argument
          torch_dtype=torch.bfloat16,
      )
    
      tokenizer = AutoTokenizer.from_pretrained(adapter_name)
      tokenizer.pad_token_id = 128002 # eos_token_id and pad_token_id should be different
    
      text = "Someday, QWER will be the greatest girl band in the world."
      input_prompt = f"Translate this from English to 한국어.\n### English: {text}\n### 한국어: "
      inputs = tokenizer(input_prompt, max_length=768, truncation=True, return_tensors='pt').to(model.device)
    
      # the adapter's tokenizer appends an EOS token to the prompt; strip it so the model continues with the translation
      if inputs['input_ids'][0][-1] == tokenizer.eos_token_id:
          inputs['input_ids'] = inputs['input_ids'][0][:-1].unsqueeze(dim=0)
          inputs['attention_mask'] = inputs['attention_mask'][0][:-1].unsqueeze(dim=0)
    
      outputs = model.generate(**inputs, max_length=768, eos_token_id=tokenizer.eos_token_id)
    
      input_len = len(inputs['input_ids'].squeeze())
      translation = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True)
      print(translation)
    
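  • If you use vLLM, the hedged sketch below shows one way to run the adapter with vLLM's LoRA support. The sampling parameters, stop strings, and max_lora_rank value are illustrative assumptions; adjust them for your vLLM version and the adapter's actual rank.

      from huggingface_hub import snapshot_download
      from vllm import LLM, SamplingParams
      from vllm.lora.request import LoRARequest

      # download the adapter weights locally; LoRARequest expects a local path
      adapter_path = snapshot_download('traintogpb/llama-3-enko-translator-8b-qlora-adapter')

      llm = LLM(
          model='beomi/Llama-3-Open-Ko-8B',
          enable_lora=True,
          max_lora_rank=64,   # assumption: set this to at least the adapter's LoRA rank
          max_model_len=768,
      )
      params = SamplingParams(temperature=0.0, max_tokens=350, stop=['<|end_of_text|>', '###'])

      # with vLLM the trailing space after the final colon can be dropped (see the prompt note above)
      text = "Someday, QWER will be the greatest girl band in the world."
      prompt = f"Translate this from English to 한국어.\n### English: {text}\n### 한국어:"

      outputs = llm.generate([prompt], params, lora_request=LoRARequest('enko-adapter', 1, adapter_path))
      print(outputs[0].outputs[0].text)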

Framework versions

  • PEFT 0.8.2