|
--- |
|
base_model: unsloth/gemma-2-9b-bnb-4bit |
|
library_name: peft |
|
license: mit |
|
language: |
|
- hi |
|
datasets: |
|
- FreedomIntelligence/alpaca-gpt4-hindi |
|
--- |
|
|
|
# Gemma2 HindiChat |
|
|
|
|
|
|
Gemma2 HindiChat is a version of Gemma 2 (9B) fine-tuned for Hindi-language tasks in a Colab notebook on an L4 GPU. The inference code below shows how to load the LoRA adapter, format Hindi prompts, and generate and decode responses.
|
|
|
## Model Details |
|
|
|
The base model, unsloth/gemma-2-9b-bnb-4bit, supports RoPE scaling, 4-bit quantization for memory efficiency, and fine-tuning with LoRA (Low-Rank Adaptation). Flash Attention 2 is used during training to handle Gemma 2's attention logit softcapping and to improve efficiency.
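
The training script itself is not included in this card, but a typical Unsloth setup matching this configuration looks roughly like the sketch below: the base model is loaded in 4-bit and LoRA adapters are attached to the attention and MLP projections. The hyperparameters shown (`max_seq_length`, `r`, `lora_alpha`, `target_modules`, etc.) are illustrative defaults from Unsloth's Gemma 2 examples, not necessarily the values used for this checkpoint.

```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # assumed; adjust to the sequence length used in training

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-9b-bnb-4bit",
    max_seq_length=max_seq_length,
    dtype=None,          # auto-detect (bfloat16 on Ampere+ GPUs)
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP projection layers
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                        # LoRA rank (illustrative)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)
```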
|
|
|
## Prompt Design |
|
|
|
A custom Hindi Alpaca-style prompt template formats the instruction (निर्देश), the optional input (इनपुट), and the expected response (उत्तर) in a conversational structure:
|
|
|
निर्देश: {instruction}

इनपुट: {input}

उत्तर: {response}
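
During fine-tuning, each training example is rendered into this template before tokenization. The sketch below illustrates that step; the column names (`instruction`, `input`, `output`), the EOS handling, and the abbreviated template (the full version with the Hindi preamble appears in the inference code further down) are assumptions for illustration, not the exact training code.

```python
# Illustrative sketch: mapping one record into the Hindi Alpaca template.
# The column names ("instruction", "input", "output") are assumed; check the
# dataset's actual schema before reusing this.
hindi_alpaca_template = """### निर्देश:
{instruction}

### इनपुट:
{input}

### उत्तर:
{output}"""

def format_example(example: dict, eos_token: str = "<eos>") -> str:
    # Appending the tokenizer's EOS token teaches the model to stop after the answer.
    return hindi_alpaca_template.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        output=example.get("output", ""),
    ) + eos_token

# Example with a made-up record
record = {
    "instruction": "पानी का रासायनिक सूत्र क्या है?",
    "input": "",
    "output": "पानी का रासायनिक सूत्र H2O है।",
}
print(format_example(record))
```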
|
|
|
```python |
|
# Install required libraries (Colab/Jupyter cell)
!pip install -q transformers peft accelerate bitsandbytes
|
|
|
from peft import PeftModel, PeftConfig |
|
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig |
|
import torch |
|
|
|
# Load the configuration for the fine-tuned model |
|
model_id = "Vijayendra/Gemma2.0-9B-HindiChat" |
|
config = PeftConfig.from_pretrained(model_id) |
|
|
|
# Load the 4-bit quantized base model and place it on the GPU
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
base_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, model_id)
|
|
|
# Load the tokenizer; left padding is recommended for batched generation
# with decoder-only models
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.padding_side = "left"
|
|
|
# Define the prompt template (same as used during training)
hindi_alpaca_prompt = """नीचे एक निर्देश दिया गया है जो एक कार्य का वर्णन करता है, और इसके साथ एक इनपुट है जो अतिरिक्त संदर्भ प्रदान करता है। एक उत्तर लिखें जो अनुरोध को सही ढंग से पूरा करता हो।

### निर्देश:
{}

### इनपुट:
{}

### उत्तर:
{}"""
|
|
|
# Define new prompts for inference
prompts = [
    hindi_alpaca_prompt.format("भारत के स्वतंत्रता संग्राम में महात्मा गांधी की भूमिका क्या थी?", "", ""),
    hindi_alpaca_prompt.format("सौर मंडल में कौन सा ग्रह सबसे छोटा है?", "", ""),
    hindi_alpaca_prompt.format("पानी का रासायनिक सूत्र क्या है?", "", ""),
    hindi_alpaca_prompt.format("हिमालय पर्वत श्रृंखला की विशेषताएँ बताइए।", "", ""),
    hindi_alpaca_prompt.format("प्रकाश संश्लेषण की प्रक्रिया क्या है?", "", ""),
]
|
|
|
# Tokenize the prompts |
|
inputs = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True).to("cuda") |
|
|
|
# Generate responses
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    use_cache=True,
)
|
|
|
# Decode and print the responses |
|
responses = tokenizer.batch_decode(outputs, skip_special_tokens=True) |
|
for i, response in enumerate(responses):
    print(f"प्रश्न {i+1}: {prompts[i].split('### निर्देश:')[1].split('### इनपुट:')[0].strip()}")
    print(f"उत्तर: {response.split('### उत्तर:')[1].strip()}")
    print("-" * 50)
```