Update README.md
README.md
CHANGED

model-index:
- name: xmanii/llama-3-8b-instruct-bnb-4bit-persian
  description: |

**Model Information**

**Developed by:** xmanii
**License:** Apache-2.0
**Finetuned from model:** unsloth/llama-3-8b-instruct-bnb-4bit

**Model Description**

This LLaMA model was fine-tuned on a Persian dataset of Alpaca-style chat conversations containing approximately 8,000 rows. Training ran on two H100 GPUs and completed in just under an hour, using Unsloth together with Hugging Face's TRL library for a roughly 2x speedup.
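
The sketch below shows roughly how such a fine-tune can be set up with Unsloth and TRL. It is only an illustration: the dataset file, LoRA settings, and training hyperparameters are placeholders, not the exact recipe used for this model, and depending on your `trl` version the `dataset_text_field`/`max_seq_length` arguments may need to move into an `SFTConfig`.

```python
# Illustrative sketch only: dataset file, LoRA settings, and hyperparameters
# are assumptions, not the exact configuration used for this model.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

max_seq_length = 2048

# Load the 4-bit base model this card was fine-tuned from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-instruct-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Attach LoRA adapters (typical Unsloth defaults, not necessarily this card's values).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder dataset: ~8k Persian Alpaca-style conversations assumed to be
# pre-rendered into a single "text" column with the Llama-3 chat template.
dataset = load_dataset("json", data_files="persian_alpaca_chat.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```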

**Open-Source Contribution**

This model is open-source, and we invite the community to use and build upon our work. The fine-tuned LLaMA model is designed to improve Persian conversation capabilities, and we hope it will contribute to the advancement of natural language processing for Persian.

**Using the Model**

To use this model, you can use the Hugging Face Transformers library. **Note:** the default usage snippet that Hugging Face generates does not apply to this model; instead, follow the example below and wrap your prompt in the chat format the model expects:

```python
prompt = "سلام! خودت را معرفی کن."  # any Persian instruction (illustrative)
messages = [{"from": "human", "value": prompt}]
```

Finally, use the pipeline to generate responses:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="xmanii/Llama3-8b-simorgh-16bit")
pipe(messages)
```
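
If you want to inspect the generated text directly, something like the following should work. The generation settings are illustrative, and the exact shape of `generated_text` (a plain string or a list of chat turns) depends on your `transformers` version:

```python
# Generation settings are illustrative; tune max_new_tokens and sampling to taste.
outputs = pipe(messages, max_new_tokens=256, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```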

**Full 16-bit Merged Model**

For a full 16-bit merged model, please check out xmanii/Llama3-8b-simorgh-16bit.
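
A minimal sketch for loading the merged 16-bit checkpoint directly, rather than through `pipeline`, is shown below; `device_map="auto"` assumes the `accelerate` package is installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xmanii/Llama3-8b-simorgh-16bit")
model = AutoModelForCausalLM.from_pretrained(
    "xmanii/Llama3-8b-simorgh-16bit",
    torch_dtype="auto",   # keep the checkpoint's native 16-bit weights
    device_map="auto",    # requires the accelerate package
)
```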

**Future Work**

We are working on quantizing the models and bringing them to ollama.