---
language:
- en
license: mit
library_name: llama.cpp
tags:
- nlp
- information-extraction
- event-detection
datasets:
- custom_dataset
metrics:
- accuracy
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
model-index:
- name: DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
  results:
  - task:
      type: text-generation
      name: Actionable Information Extraction
    dataset:
      type: custom_dataset
      name: Custom Dataset for Event & Bullet Extraction
    metrics:
    - type: latency
      value: 33.78
      name: Prompt Eval Time (ms per token)
      args:
        device: 3090 Ti
---
# DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
## Overview
This is a fine-tuned version of **DeepSeek R1 Distill Qwen 1.5B**, optimized for extracting actionable insights and scheduling events from conversations. The model was fine-tuned for **2500 steps** (**9 epochs**) on **2194 examples** for structured information extraction.
## Model Details
- **Base Model:** DeepSeek R1 Distill Qwen 1.5B
- **Fine-tuning Steps:** 2500
- **Epochs:** 9
- **Dataset Size:** 2194 examples
- **License:** MIT
- **File Format:** GGUF
- **Released Version:** Q4_K_M.gguf
## Benchmarks
Inference performance on an RTX 3090 Ti versus a Raspberry Pi 5:

| Metric | **3090 Ti** | **Raspberry Pi 5** |
|-----------------|-------------------------------|-------------------------------|
| **Prompt Eval Time** | 33.78 ms / 406 tokens (0.08 ms per token, 12017.88 tokens/sec) | 17831.25 ms / 535 tokens (33.33 ms per token, 30.00 tokens/sec) |
| **Eval Time** | 7133.93 ms / 1694 tokens (4.21 ms per token, 237.46 tokens/sec) | 52006.54 ms / 529 tokens (98.31 ms per token, 10.17 tokens/sec) |
| **Total Time** | 7167.72 ms / 2100 tokens | 70881.95 ms / 1064 tokens |
| **Decoding Speed** | N/A | 529 tokens in 70.40s (7.51 tokens/sec) |
| **Sampling Speed** | N/A | 149.33 ms / 530 runs (0.28 ms per token, 3549.26 tokens/sec) |
### Observations
- The **3090 Ti** is significantly faster, handling **12017.88 tokens/sec** for prompt evaluation, compared to **30 tokens/sec** on the **Pi 5**.
- In token evaluation, the **3090 Ti** manages **237.46 tokens/sec**, whereas the **Pi 5** achieves just **10.17 tokens/sec**.
- The **Pi 5**'s total run took **70.88 s** versus **7.17 s** on the **3090 Ti**, roughly a 10x wall-clock gap, while also processing about half as many tokens (1064 vs. 2100).
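The per-token and tokens/sec columns are derived directly from the raw totals; a quick arithmetic check using the numbers from the table:

```python
# Throughput is simply tokens divided by elapsed seconds.
def tokens_per_sec(tokens: int, ms: float) -> float:
    return tokens / (ms / 1000.0)

# 3090 Ti eval: 1694 tokens in 7133.93 ms
gpu = tokens_per_sec(1694, 7133.93)
print(round(gpu, 2))  # 237.46

# Raspberry Pi 5 eval: 529 tokens in 52006.54 ms
pi = tokens_per_sec(529, 52006.54)
print(round(pi, 2))   # 10.17

# The GPU evaluates tokens roughly 23x faster than the Pi 5
print(round(gpu / pi, 1))
```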
## Usage Instructions
### System Prompt
To use this model effectively, initialize it with the following system prompt:
```
### Instruction:
Purpose:
Extract actionable information from the provided dialog and metadata, generating bullet points with importance rankings and identifying relevant calendar events.
### Steps:
1. **Context Analysis:**
- Use `CurrentDateTime` to interpret relative time references (e.g., "tomorrow").
- Prioritize key information based on `InformationRankings`:
- Higher rank values indicate more emphasis on that aspect.
2. **Bullet Points:**
- Summarize key points concisely.
- Assign an importance rank (1-100).
- Format: `"[Summary]"[1-100]`
3. **Event Detection:**
- Identify and structure events with clear scheduling details.
- Format:
`EventTitle:"[Title]",StartDate:"[YYYY-MM-DD,HH:MM]",EndDate:"[YYYY-MM-DD,HH:MM or N/A]",Recurrence:"[Daily/Weekly/Monthly or N/A]",Details:"[Summary]"`
4. **Filtering:**
- Exclude vague, non-actionable statements.
- Only create events for clear, actionable scheduling needs.
5. **Output Consistency:**
- Follow the exact XML format.
- Ensure structured, relevant output.
ONLY REPLY WITH THE XML AFTER YOU END THINK.
Dialog: "{conversations}"
CurrentDateTime: "{date_and_time_and_day}"
InformationRankings: "{information_rankings}"
```
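The `{conversations}`, `{date_and_time_and_day}`, and `{information_rankings}` placeholders must be filled in at inference time. A minimal sketch of assembling the final prompt; the sample values below are illustrative, not part of the model card:

```python
# Substitute the three runtime placeholders into the system prompt template.
SYSTEM_PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "...\n"  # the full instruction text from the section above goes here
    'Dialog: "{conversations}"\n'
    'CurrentDateTime: "{date_and_time_and_day}"\n'
    'InformationRankings: "{information_rankings}"\n'
)

prompt = SYSTEM_PROMPT_TEMPLATE.format(
    conversations="Alice: Let's meet Monday at 10 to finalize the project.",
    date_and_time_and_day="2025-06-06,09:00,Friday",
    information_rankings="events:90,deadlines:80",
)
print(prompt)
```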
## How to Run the Model
### Using llama.cpp
If you are using `llama.cpp`, run the model with (recent builds ship the CLI binary as `llama-cli`; older builds name it `main`):
```bash
./llama-cli -m Q4_K_M.gguf --file system_prompt.txt --temp 0.7 --n-gpu-layers 50
```
Here `system_prompt.txt` contains the system prompt above with its placeholders filled in.
### Using Text Generation WebUI
1. Download and place the `Q4_K_M.gguf` file in the models folder.
2. Start the WebUI:
```bash
python server.py --model Q4_K_M.gguf
```
3. Use the system prompt above for structured output.
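If you start the WebUI with its API mode enabled, you can also query the model programmatically. The endpoint path, port, and flag below are assumptions about the WebUI's OpenAI-compatible API (`python server.py --api`); check your installed version:

```python
import json

# Request payload for an OpenAI-compatible completions endpoint.
# NOTE: endpoint details are assumptions about text-generation-webui's API
# mode and may differ in your setup.
payload = {
    "prompt": "### Instruction:\n...",  # the filled-in system prompt from above
    "max_tokens": 512,
    "temperature": 0.7,
}
body = json.dumps(payload).encode("utf-8")

# To send it (requires the WebUI running with --api on the default port):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:5000/v1/completions", data=body,
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
print(len(body))
```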
## Expected Output Format
Example response when processing a conversation:
```xml
"Team meeting scheduled for next Monday to finalize project details."85
EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project details with the team."
```
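Downstream code has to split this output into ranked bullet points and event records. A small parser sketch for the two line formats shown above; the regexes are my own reading of the format, not an official specification:

```python
import re

# Bullet lines look like: "Summary text"85
BULLET_RE = re.compile(r'^"(?P<summary>.+)"(?P<rank>\d{1,3})$')
# Event lines follow the comma-separated key:"value" layout from the prompt.
EVENT_RE = re.compile(
    r'^EventTitle:"(?P<title>[^"]*)",StartDate:"(?P<start>[^"]*)",'
    r'EndDate:"(?P<end>[^"]*)",Recurrence:"(?P<recurrence>[^"]*)",'
    r'Details:"(?P<details>[^"]*)"$'
)

def parse_line(line: str) -> dict:
    """Classify one model output line as a bullet point or an event record."""
    if m := EVENT_RE.match(line):
        return {"kind": "event", **m.groupdict()}
    if m := BULLET_RE.match(line):
        return {"kind": "bullet", "summary": m["summary"], "rank": int(m["rank"])}
    raise ValueError(f"unrecognized line: {line!r}")

bullet = parse_line('"Team meeting scheduled for next Monday to finalize project details."85')
event = parse_line('EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",'
                   'EndDate:"2025-06-10,11:00",Recurrence:"N/A",'
                   'Details:"Finalizing project details with the team."')
print(bullet["rank"], event["title"])  # 85 Team Meeting
```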
## License
This model is released under the **MIT License**, allowing free usage, modification, and distribution.
## Contact & Support
For any inquiries or support, please visit [Hugging Face Discussions](https://huggingface.co/unsloth/DeepSeek-R1-GGUF) or open an issue on the repository.