---
base_model: unsloth/gemma-2-9b-bnb-4bit
library_name: peft
license: mit
language:
- hi
datasets:
- FreedomIntelligence/alpaca-gpt4-hindi
---

# Model Card for Gemma2 HindiChat

<!-- Provide a quick summary of what the model is/does. -->

Gemma2 HindiChat is a version of Gemma 2 (9B) fine-tuned in a Colab notebook on an L4 GPU for Hindi-language tasks. The inference code below shows how to load the adapter, format Hindi prompts, and generate responses for Hindi language understanding and generation.

## Model Details

The base model, unsloth/gemma-2-9b-bnb-4bit, supports RoPE scaling, 4-bit quantization for memory efficiency, and fine-tuning with LoRA (Low-Rank Adaptation). Flash Attention 2 is used to enable attention softcapping and to improve efficiency during training.
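
The fine-tuning code itself is not part of this card. As a rough illustration only, a typical Unsloth setup for this base model is sketched below; the sequence length, LoRA rank, and target modules are assumptions, not the exact values used for this checkpoint.

```python
# pip install unsloth
from unsloth import FastLanguageModel

# Load the 4-bit base checkpoint; Unsloth handles RoPE scaling internally
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-9b-bnb-4bit",
    max_seq_length=2048,   # assumed sequence length, not confirmed for this checkpoint
    load_in_4bit=True,
)

# Attach LoRA adapters (hyperparameters below are illustrative defaults)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```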

## Prompt Design

A custom Hindi Alpaca-style prompt template formats the instruction, input, and expected output in a conversational structure:

निर्देश:{instruction}

इनपुट:{input}

उत्तर:{response}
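
As an illustration, training rows can be mapped onto this template before fine-tuning. The sketch below assumes the FreedomIntelligence/alpaca-gpt4-hindi rows expose Alpaca-style `instruction`, `input`, and `output` fields; the actual column names and EOS handling used during training are not documented in this card.

```python
from datasets import load_dataset

# Same Hindi Alpaca-style template used for inference below
hindi_alpaca_prompt = """नीचे एक निर्देश दिया गया है जो एक कार्य का वर्णन करता है, और इसके साथ एक इनपुट है जो अतिरिक्त संदर्भ प्रदान करता है। एक उत्तर लिखें जो अनुरोध को सही ढंग से पूरा करता हो।

### निर्देश:
{}

### इनपुट:
{}

### उत्तर:
{}"""

EOS_TOKEN = "<eos>"  # assumption: in practice use tokenizer.eos_token

def format_example(example):
    # Map one dataset row onto the template; append EOS so the model learns to stop.
    # Field names are assumed; adjust to the dataset's actual schema.
    text = hindi_alpaca_prompt.format(
        example.get("instruction", ""),
        example.get("input", ""),
        example.get("output", ""),
    ) + EOS_TOKEN
    return {"text": text}

dataset = load_dataset("FreedomIntelligence/alpaca-gpt4-hindi", split="train")
dataset = dataset.map(format_example)
```

The complete inference example for the published adapter follows.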

```python
# Install required libraries
!pip install transformers peft accelerate bitsandbytes

from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Load the configuration for the fine-tuned model
model_id = "Vijayendra/Gemma2.0-9B-HindiChat"
config = PeftConfig.from_pretrained(model_id)

# Load the 4-bit base model and attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",  # place the model on the GPU so it matches the inputs below
)
model = PeftModel.from_pretrained(base_model, model_id)

# Load the tokenizer; pad on the left for batched generation with a decoder-only model
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.padding_side = "left"

# Define the prompt template (same as used during training)
hindi_alpaca_prompt = """नीचे एक निर्देश दिया गया है जो एक कार्य का वर्णन करता है, और इसके साथ एक इनपुट है जो अतिरिक्त संदर्भ प्रदान करता है। एक उत्तर लिखें जो अनुरोध को सही ढंग से पूरा करता हो।

### निर्देश:
{}

### इनपुट:
{}

### उत्तर:
{}"""

# Define new prompts for inference
prompts = [
    hindi_alpaca_prompt.format("भारत के स्वतंत्रता संग्राम में महात्मा गांधी की भूमिका क्या थी?", "", ""),
    hindi_alpaca_prompt.format("सौर मंडल में कौन सा ग्रह सबसे छोटा है?", "", ""),
    hindi_alpaca_prompt.format("पानी का रासायनिक सूत्र क्या है?", "", ""),
    hindi_alpaca_prompt.format("हिमालय पर्वत श्रृंखला की विशेषताएँ बताइए।", "", ""),
    hindi_alpaca_prompt.format("प्रकाश संश्लेषण की प्रक्रिया क्या है?", "", "")
]

# Tokenize the prompts and move them to the model's device
inputs = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True).to(model.device)

# Generate responses
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    use_cache=True,
)

# Decode and print the responses
responses = tokenizer.batch_decode(outputs, skip_special_tokens=True)
for i, response in enumerate(responses):
    print(f"प्रश्न {i+1}: {prompts[i].split('### निर्देश:')[1].split('### इनपुट:')[0].strip()}")
    print(f"उत्तर: {response.split('### उत्तर:')[1].strip()}")
    print("-" * 50)