InferenceLab committed on
Commit 01c9212 · verified · 1 Parent(s): a256681

Update README.md

Files changed (1):
  1. README.md +23 -7
README.md CHANGED
@@ -62,15 +62,31 @@ Users should validate outputs with certified medical professionals. This model i
  ## How to Get Started with the Model

  ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer

- model = AutoModelForCausalLM.from_pretrained("InferenceLab/MediLlama-3.2")
- tokenizer = AutoTokenizer.from_pretrained("InferenceLab/MediLlama-3.2")
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer

- input_text = "What are the symptoms of diabetes?"
- inputs = tokenizer(input_text, return_tensors="pt")
- outputs = model.generate(**inputs)
- print(tokenizer.decode(outputs[0]))
+ model_id = "InferenceLab/MediLlama-3.2"
+
+ # Load tokenizer and model
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16
+ )
+
+ # Apply the chat template
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "What are the symptoms of diabetes?"},
+ ]
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ # Generate response
+ chat_outputs = model.generate(**chat_input, max_new_tokens=500)
+ response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)  # Decode only the response part
+ print("\nAssistant Response:", response)
  ```

  ## Training Details
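
For comparison, the same chat flow can be expressed through the high-level `pipeline` API. This is a sketch, not part of the commit: it assumes a recent `transformers` release in which text-generation pipelines accept chat-formatted message lists directly and apply the checkpoint's chat template automatically.

```python
import torch
from transformers import pipeline

# Build a text-generation pipeline; bfloat16 mirrors the loading choice
# in the updated README snippet.
generator = pipeline(
    "text-generation",
    model="InferenceLab/MediLlama-3.2",
    torch_dtype=torch.bfloat16,
)

# Passing a list of {"role", "content"} messages triggers chat templating
# in recent transformers versions (assumption about the installed version).
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What are the symptoms of diabetes?"},
]
result = generator(messages, max_new_tokens=500)

# For chat input, generated_text holds the whole conversation; the last
# message is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```

Either way, routing the prompt through the chat template is the substance of this commit: the removed snippet fed a plain string to an instruction-tuned chat model, skipping the template and typically degrading output quality.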