---
license: mit
datasets:
- Replete-AI/code_bagel
---

# Chatty-McChatterson-3-mini-128k

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6324ce4d5d0cf5c62c6e3c5a/zKJXnm52nly4viTzs0Ysa.png)

## Model Details

**Model Name:** Chatty-McChatterson-3-mini-128k

**Base Model:** [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)

**Fine-tuning Method:** Supervised Fine-Tuning (SFT)

**Dataset:** [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)

**Training Data:** 12,884 conversations selected for having prompts of 512 input tokens or fewer

**Training Duration:** 4 hours

**Hardware:** Nvidia RTX A4500

**Epochs:** 3

## Training Procedure

This model was fine-tuned to provide better responses to code-related instructions. Training was conducted with PEFT and SFTTrainer on selected conversations from the Ultra Chat 200k dataset and completed in 3 epochs (19,326 steps) over roughly 4 hours on an Nvidia RTX A4500 GPU. The dataset comprised a filtered subset of rows from [Ultra Chat 200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) whose formatted prompt was 512 tokens or fewer.

## Intended Use

This model is designed to improve the overall chat experience and response quality.

## Getting Started

### Instruct Template

```
<|system|>
{system_message}<|end|>
<|user|>
{prompt}<|end|>
<|assistant|>
```

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name_or_path = "thesven/Chatty-McChatterson-3-mini-128k"

# BitsAndBytesConfig for loading the model in 4-bit precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    trust_remote_code=False,
    revision="main",
    quantization_config=bnb_config,
)
# Use the EOS token as the padding token
model.config.pad_token_id = model.config.eos_token_id

prompt_template = '''<|user|>
What is the name of the big tower in Toronto?<|end|>
<|assistant|>
'''

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(
    inputs=input_ids,
    temperature=0.1,
    do_sample=True,
    top_p=0.95,
    top_k=40,
    max_new_tokens=256,
)
# Decode only the newly generated tokens, skipping the prompt
generated_text = tokenizer.decode(output[0, len(input_ids[0]):], skip_special_tokens=True)
print(generated_text)
```
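
### Using the Tokenizer's Chat Template

If you prefer not to write the instruct template by hand, the tokenizer's built-in chat template can build the prompt for you. The sketch below is a minimal example assuming the fine-tuned tokenizer inherits the Phi-3 chat template from the base model; the `messages` content is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name_or_path = "thesven/Chatty-McChatterson-3-mini-128k"

# Same 4-bit quantization configuration as above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    quantization_config=bnb_config,
)

# Example conversation; assumes the tokenizer ships with the Phi-3 chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a Python function that reverses a string."},
]

# apply_chat_template inserts the <|system|>/<|user|>/<|assistant|> markers for us
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    input_ids,
    temperature=0.1,
    do_sample=True,
    top_p=0.95,
    top_k=40,
    max_new_tokens=256,
)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0, input_ids.shape[-1]:], skip_special_tokens=True))
```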